
Flowlines vs LangSmith.

Flowlines is a layer on top of LangSmith, not a replacement for it. LangSmith traces and evaluates individual LLM calls; Flowlines reads those traces and adds cross-session behavioral signals, user cohorts, and prompt-version impact. Most teams run both.

LangSmith is execution observability and evaluation: it captures what happened inside a run and lets you debug a trace and score outputs against a test set. That's the call-level view.

Flowlines works one level up. It reads the traces you already collect (LangSmith, Langfuse, or any OTEL source) and answers questions a single trace can't: Is this agent drifting week over week? Which cohort of users is about to churn? Did the last prompt deploy actually help? You don't replace LangSmith; you read from it.
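To make "drifting week over week" concrete, here is a minimal sketch of the kind of cross-session rollup a single trace can't give you. It is an illustration only, not Flowlines' actual method or API: the record fields (`session_id`, `started_at`, `flagged`) and both function names are hypothetical, and `flagged` stands in for whatever per-trace signal you already have.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical trace records exported from a trace store.
# Field names are assumptions for this sketch, not a real schema.
traces = [
    {"session_id": "s1", "started_at": "2024-05-06T10:00:00", "flagged": False},
    {"session_id": "s1", "started_at": "2024-05-06T10:02:00", "flagged": True},
    {"session_id": "s2", "started_at": "2024-05-07T09:00:00", "flagged": False},
    {"session_id": "s3", "started_at": "2024-05-08T14:30:00", "flagged": False},
    {"session_id": "s4", "started_at": "2024-05-09T16:45:00", "flagged": False},
    {"session_id": "s5", "started_at": "2024-05-13T11:00:00", "flagged": True},
    {"session_id": "s6", "started_at": "2024-05-14T12:00:00", "flagged": True},
    {"session_id": "s7", "started_at": "2024-05-15T13:00:00", "flagged": False},
    {"session_id": "s8", "started_at": "2024-05-16T08:15:00", "flagged": False},
]


def weekly_flag_rate(traces):
    """Map (ISO year, ISO week) to the share of sessions with any flagged trace."""
    sessions_by_week = defaultdict(dict)
    for t in traces:
        iso = datetime.fromisoformat(t["started_at"]).isocalendar()
        week = (iso[0], iso[1])
        # Roll traces up to the session: one flagged trace flags the session.
        already_flagged = sessions_by_week[week].get(t["session_id"], False)
        sessions_by_week[week][t["session_id"]] = already_flagged or t["flagged"]
    return {
        week: sum(sessions.values()) / len(sessions)
        for week, sessions in sorted(sessions_by_week.items())
    }


def week_over_week_change(rates):
    """Map each week (after the first) to its change in rate versus the prior week."""
    weeks = sorted(rates)
    return {curr: rates[curr] - rates[prev] for prev, curr in zip(weeks, weeks[1:])}


rates = weekly_flag_rate(traces)
print(rates)
print(week_over_week_change(rates))
```

The point of the sketch is the change in unit of analysis: traces are grouped into sessions, sessions into weeks, and the comparison runs between weeks rather than inside any one run.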

| | LangSmith | Flowlines |
|---|---|---|
| Primary job | Trace + evaluate one call | Behavior across thousands of sessions |
| Per-call traces | Yes, native | Reads them from your store |
| Offline evals on test sets | Yes | No, that's pre-ship |
| Cross-session signals | No | Yes (hallucination, drift, frustration…) |
| User & cohort rollups | No | Yes (happy / at-risk / churning) |
| Prompt-version impact | Limited | Before/after per deploy |
| Integration | SDK in your agent | Reads your traces, no SDK |

Bottom line

Use LangSmith to debug and evaluate calls before you ship. Use Flowlines to see how the agent behaves once real users hit it in production. They're complementary, and Flowlines reads straight from your LangSmith/OTEL traces so there's nothing new to instrument.