Engineering · 5 min read · 15 Mar 2026

How to integrate Flowlines in 5 minutes

Add observability and memory to any Python AI agent with four lines of code: install the SDK, initialize before your LLM client, wrap calls in context, and retrieve memory.

Most observability tools require you to rearchitect your agent. Flowlines does not. You install an SDK, call init, and your LLM calls are automatically captured. Here is how it works.

Step 1: Install the SDK

Flowlines requires Python 3.10+. Install it with the extras for whatever LLM provider you use:

bash
pip install "flowlines[openai]"

You can install multiple providers at once:

bash
pip install "flowlines[openai,anthropic]"

Or install everything:

bash
pip install "flowlines[all]"

Flowlines supports OpenAI, Anthropic, Google, Bedrock, Cohere, Groq, Mistral, Together, Ollama, and dozens more. It also supports frameworks like LangChain, LlamaIndex, CrewAI, and Haystack out of the box.

Step 2: Initialize before your LLM client

This is the one thing you need to get right. flowlines.init() must be called before you create your LLM client. If the client already exists, its calls will not be captured.

python
import flowlines
from openai import OpenAI

flowlines.init(api_key="your-flowlines-api-key")
client = OpenAI()

That is it. Every LLM call made through that client is now automatically traced and exported to Flowlines. No decorators. No wrappers. No manual span creation. It works through OpenTelemetry instrumentation under the hood.
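Flowlines' internals are not shown here, but the general technique behind this kind of auto-instrumentation is worth seeing once: patch the client library's method so every call is recorded before the original runs. A minimal sketch of that idea, using a toy client and invented names rather than anything from Flowlines or OpenTelemetry:

```python
import functools
import time

captured = []  # stand-in for a real span exporter

def instrument(cls, method_name):
    """Wrap a method on a class so every call is recorded."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def traced(self, *args, **kwargs):
        start = time.perf_counter()
        result = original(self, *args, **kwargs)
        captured.append({
            "method": method_name,
            "latency_s": time.perf_counter() - start,
        })
        return result

    setattr(cls, method_name, traced)

# Toy class standing in for an LLM SDK client.
class ToyClient:
    def complete(self, prompt):
        return f"echo: {prompt}"

instrument(ToyClient, "complete")  # conceptually, what init() does

client = ToyClient()
client.complete("hello")
print(captured[0]["method"])  # -> complete
```

Because the patch replaces the method on the class, every call made through any instance created afterwards is captured without the calling code changing at all.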

Step 3: Add context to your calls

Wrap your LLM calls in flowlines.context() so Flowlines knows which user and session each call belongs to:

python
with flowlines.context(user_id="user-42", session_id="sess-abc"):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=messages
    )

This is what enables Flowlines to detect behavioral signals per user, track agent drift across sessions, and build structured memory over time.

If a context manager does not fit your architecture, there is an imperative API:

python
token = flowlines.set_context(user_id="user-42", session_id="sess-abc")
try:
    client.chat.completions.create(...)
finally:
    flowlines.clear_context(token)

Step 4: Retrieve and inject memory

This is where Flowlines diverges from pure observability tools. You can retrieve what Flowlines has learned about a user and inject it into your prompt:

python
memory = flowlines.get_memory("user-42", session_id="sess-abc")

messages = [{"role": "system", "content": "You are a helpful assistant."}]

if memory:
    messages.append({
        "role": "system",
        "content": f"Here is what you know about this user from previous conversations:\n{memory}",
    })

messages.append({"role": "user", "content": user_input})

Memory is not a vector search. It is structured knowledge that Flowlines has extracted from real interactions: user preferences, past corrections, behavioral patterns, constraints the user has expressed over time.
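The exact shape of the returned memory is not specified above, but if you picture it as structured records rather than raw text, the injection step is just serialization into a system message. A hypothetical sketch, with field names invented for illustration:

```python
def memory_to_system_message(memory: dict) -> dict:
    """Serialize structured memory records into one system message."""
    lines = []
    for category, items in memory.items():
        lines.append(f"{category.capitalize()}:")
        lines.extend(f"- {item}" for item in items)
    return {
        "role": "system",
        "content": "Here is what you know about this user:\n" + "\n".join(lines),
    }

# Example records; real categories would come from observed interactions.
memory = {
    "preferences": ["answers in French", "no code comments"],
    "corrections": ["company name is Acme, not ACME Corp"],
}
msg = memory_to_system_message(memory)
print(msg["content"])
```

Keeping the injected memory in its own system message, as the snippet above does, makes it easy to drop or truncate independently of your base system prompt.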

The full picture

Here is a complete working example:

python
import flowlines
from openai import OpenAI

# Init must run before the client is created so calls are captured.
flowlines.init(api_key="your-flowlines-api-key")
client = OpenAI()

user_id = "user-42"
session_id = "sess-abc"

# Pull what Flowlines already knows about this user.
memory = flowlines.get_memory(user_id)

messages = [{"role": "system", "content": "You are a helpful assistant."}]
if memory:
    messages.append({
        "role": "system",
        "content": f"Here is what you know about this user:\n{memory}",
    })
messages.append({"role": "user", "content": "Hello!"})

# Attach user and session context so the call is attributed correctly.
with flowlines.context(user_id=user_id, session_id=session_id):
    response = client.chat.completions.create(
        model="gpt-4", messages=messages
    )

# Mark the session as finished when the conversation ends.
flowlines.end_session(user_id=user_id, session_id=session_id)

What you will see in the dashboard

Run your application, make a few LLM calls, then open your Flowlines dashboard. Within seconds, your first traces appear.

Each trace shows the full lifecycle of an LLM call: the model used, input tokens, output tokens, latency, and the user and session context you attached. You can drill into any trace to see the exact messages sent and received.

As interactions accumulate, the Signals view lights up. Flowlines starts surfacing behavioral patterns: which sessions show signs of drift, where context loss occurs most often, which users trigger the most corrections. These are the signals your logs miss.

The Memory tab shows what Flowlines has structured per user. Preferences extracted from real conversations. Corrections the user made that your agent should remember. Constraints expressed across sessions. This memory is what gets returned when you call flowlines.get_memory().

You do not configure any of this. No rules to write. No thresholds to set. The observation layer does the work.

What happens next

Once traces start flowing, Flowlines automatically detects behavioral signals: agent drift, context loss, user frustration, intent shifts. It builds structured memory per user. Your agent gets better over time without you manually updating prompts.

The integration takes 5 minutes. The compounding takes weeks. But it starts the moment you ship that first flowlines.init().

Alex Ayoub
Founder, Flowlines