Reference Architecture and Local Setup

This chapter creates the local foundation used by the implementation chapters. It is not a production deployment guide. The local Langfuse stack exists so the reader can inspect the full telemetry path without creating a hosted observability account or paying for a backend.

The implementation module uses a small order-support agent. It retrieves policy snippets, reads order state through a tool, calls the OpenAI Responses API, records OpenTelemetry spans, and exports traces through a local Collector into Langfuse.

The domain is intentionally small. The boundaries are not. This setup includes the pieces that make agent observability difficult in production: model calls, retrieval, tools, branching, runtime budgets, content policy, telemetry export, and backend inspection.

What we will change

Create and work in the demo project:

mkdir agent-observability-demo
cd agent-observability-demo

This chapter touches these files and directories:

File or directory	What to do
`requirements.txt`	Create the Python dependency list.
`requirements.lock.txt`	Generate the pinned local dependency lock.
`.env.example` and `.env`	Create the safe configuration template and local copy.
`.gitignore`	Ignore secrets, virtual environments, and cache files.
`src/agent_observability/config.py`	Create the typed settings loader.
`infrastructure/langfuse/`	Clone the official local Langfuse stack.

Architecture

The runtime path and the telemetry path are easier to read separately.

The runtime path is the agent execution:

The telemetry path is the observability pipeline:

The application exports OTLP to the Collector. It does not send telemetry directly to Langfuse. This keeps filtering, batching, sampling, and backend credentials outside application code.

The local setup has three important boundaries:

Boundary	Why it exists
Application to Collector	Application code emits OTLP; Collector owns processing and export.
Collector to Langfuse	Backend credentials stay in Collector configuration.
Agent data to telemetry	Content capture remains disabled unless a later chapter explicitly enables it.

Project layout

Create the demo project with this shape:

agent-observability-demo/
├── .env.example
├── .gitignore
├── collector-config.yaml
├── compose.yaml
├── requirements.txt
├── requirements.lock.txt
├── infrastructure/
│   └── langfuse/
├── src/
│   └── agent_observability/
│       ├── __init__.py
│       ├── config.py
│       ├── graph.py
│       ├── inference.py
│       ├── main.py
│       ├── retrieval.py
│       ├── telemetry.py
│       └── tools.py
└── tests/
    ├── test_graph.py
    ├── test_privacy.py
    └── test_telemetry.py

The layout separates application code, telemetry wiring, local infrastructure, and tests. Later chapters fill these files in one layer at a time.

Create the Python environment

Start with an isolated virtual environment:

mkdir agent-observability-demo
cd agent-observability-demo
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip

Create requirements.txt:

langgraph>=1.0,<2
openai>=2,<3
opentelemetry-api>=1.39,<2
opentelemetry-sdk>=1.39,<2
opentelemetry-exporter-otlp-proto-http>=1.39,<2
pydantic>=2.10,<3
pydantic-settings>=2.7,<3
pytest>=8,<9

Then install and lock the local run:

python -m pip install -r requirements.txt
python -m pip freeze > requirements.lock.txt

The version ranges keep the tutorial readable. The generated lock file makes the local run reproducible. Production projects should use their established dependency, vulnerability, and upgrade process.

Configuration

Create .env.example:

OPENAI_API_KEY=replace-me
OPENAI_MODEL=replace-with-current-responses-model
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4318/v1/traces
OTEL_SERVICE_NAME=order-support-agent
DEPLOYMENT_ENVIRONMENT=development
AGENT_VERSION=local
CAPTURE_CONTENT=false
MAX_AGENT_ITERATIONS=6
MAX_MODEL_CALLS=4

Set OPENAI_MODEL to a current Responses API model available to the account. The rest of the series treats the model value as configuration, not as part of the observability model.

Copy it locally and keep secrets out of Git:

cp .env.example .env

.env
.venv/
__pycache__/
.pytest_cache/

Create src/agent_observability/config.py:

from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    openai_api_key: str
    openai_model: str
    otel_exporter_otlp_traces_endpoint: str = (
        "http://localhost:4318/v1/traces"
    )
    otel_service_name: str = "order-support-agent"
    deployment_environment: str = "development"
    agent_version: str = "local"
    capture_content: bool = False
    max_agent_iterations: int = Field(default=6, ge=1, le=20)
    max_model_calls: int = Field(default=4, ge=1, le=20)


settings = Settings()  # pyright: ignore[reportCallIssue]

Start Langfuse locally

This setup starts Langfuse locally so the rest of the series can show the full telemetry path without requiring a hosted account. It is a learning environment, not a production recommendation.

Use the current official self-hosting files instead of copying an abbreviated Compose configuration into the article. The official Docker Compose guide is the source of truth if file names or required variables change.

git clone --depth 1 https://github.com/langfuse/langfuse.git infrastructure/langfuse
cd infrastructure/langfuse
cp .env.prod.example .env

Replace every secret marked CHANGEME, then start the stack:

docker compose up -d
docker compose ps

Open http://localhost:3000, create a project, and create API keys.

Docker Compose is appropriate for local learning and integration troubleshooting. Do not treat this Compose stack as a production observability platform. Production use needs a separately designed deployment model with high availability, backups, scaling, secret management, network controls, retention, and upgrade procedures.

OpenAI data storage setting

The implementation uses the Responses API with store=False. OpenAI documents that Responses are stored by default unless storage is disabled. Response storage is separate from telemetry content capture: disabling OpenAI response storage does not prevent our own application, Collector, or backend from exporting content.

Keep the controls separate:

Control	What it affects
`store=False` on OpenAI Responses API calls	OpenAI response storage for that API request, subject to organization data controls.
`CAPTURE_CONTENT=false` in this project	Whether the demo instrumentation exports prompts, outputs, tool payloads, or retrieved text.
Collector filtering and backend configuration	What telemetry is processed, exported, stored, and retained after the SDK emits it.

All three must be configured deliberately.

Verify prerequisites

Return to the demo project root before running these checks:

cd ../../

Verify Python dependencies:

python -c "import langgraph, openai, opentelemetry; print('dependencies ok')"

Verify Langfuse health:

curl -fsS http://localhost:3000/api/public/health

Verify the local safety defaults:

PYTHONPATH=src python - <<'PY'
from agent_observability.config import settings

assert settings.capture_content is False
assert settings.deployment_environment == "development"
assert settings.max_agent_iterations <= 20
assert settings.max_model_calls <= 20
print("configuration ok")
PY

Do not continue if the backend is unhealthy, placeholder Langfuse secrets remain in the Compose environment, or .env contains a real API key that is tracked by Git.

What should exist before we go to Chapter 13

At the end of this chapter, the local environment should have:

A Python project with dependencies installed and locked.
A .env file outside version control.
A configuration object that keeps secrets out of telemetry.
A local Langfuse stack reachable at http://localhost:3000.
A Langfuse project with API keys ready for the Collector.
Content capture disabled by default.

Chapter 13 uses this foundation to configure the OpenTelemetry SDK and Collector.

References

Next up: Ch 13 - Building the OpenTelemetry Pipeline sends one verified trace through the Collector into Langfuse.