OpenTelemetry
The gate, the adapter base, and the agent session loop all emit OpenTelemetry spans. Every span carries kitelogik.* semantic attributes describing the policy decision, the risk tier, the session, and (where relevant) the delegation lineage — so a trace view shows the full why of any allow / deny / HITL outcome, not just timing.
Spans the runtime emits
| Span name | Where | Carries |
|---|---|---|
| kitelogik.adapter.tool_call | every adapter (OpenAIAdapter, LangChainAdapter, …) | kitelogik.tool.name, kitelogik.tool.action, kitelogik.session_id, kitelogik.allowed |
| Gate evaluation spans | PolicyGate.evaluate_tool_call / evaluate_event | kitelogik.tool_name, kitelogik.action, kitelogik.session_id, kitelogik.user_role, kitelogik.policy.allow, kitelogik.policy.deny, kitelogik.policy.risk_tier, kitelogik.policy.requires_hitl, kitelogik.policy.reason, kitelogik.event_type |
| Sanitiser span | tool-output scrubbing | kitelogik.sanitizer.was_modified |
| Agent session spans | agents/session.py | kitelogik.session_id, kitelogik.user_role, kitelogik.token_id, kitelogik.delegation_depth, kitelogik.iteration |
Spans nest naturally: the adapter span wraps the gate span, which wraps any downstream sanitiser span. A multi-agent run produces a tree where the parent session's span contains the delegated children's spans, joined by trace-id propagation through delegation_depth / parent_token_id.
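To make the nesting concrete, here is a minimal sketch that rebuilds a span tree from flat records using only span_id and parent_span_id. It assumes records shaped like the get_finished_spans() snapshot described further down this page. The helper names (build_span_tree, render) and the sample IDs are invented for the example, and the gate and sanitiser span names are placeholders because this page does not list them.

```python
from collections import defaultdict


def build_span_tree(spans):
    """Return (roots, children), where children maps span_id -> list of child spans."""
    known_ids = {s["span_id"] for s in spans}
    children = defaultdict(list)
    roots = []
    for span in spans:
        parent = span.get("parent_span_id")
        if parent in known_ids:
            children[parent].append(span)
        else:
            # No parent, or the parent is missing from this batch of spans.
            roots.append(span)
    return roots, children


def render(spans):
    """Render the tree as indented lines, one span per line, ordered by start time."""
    roots, children = build_span_tree(spans)
    lines = []

    def walk(span, depth):
        lines.append("  " * depth + span["name"])
        for child in sorted(children[span["span_id"]], key=lambda s: s["start_ms"]):
            walk(child, depth + 1)

    for root in sorted(roots, key=lambda s: s["start_ms"]):
        walk(root, 0)
    return lines


# Hypothetical sample data: the IDs and the two placeholder names are made up.
sample = [
    {"span_id": "c", "parent_span_id": "b", "name": "sanitiser (placeholder)", "start_ms": 3},
    {"span_id": "a", "parent_span_id": None, "name": "kitelogik.adapter.tool_call", "start_ms": 1},
    {"span_id": "b", "parent_span_id": "a", "name": "gate (placeholder)", "start_ms": 2},
]
tree_lines = render(sample)
print("\n".join(tree_lines))
```

Spans whose parent is not in the batch are treated as roots, which keeps the rendering usable when older spans have already been dropped.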
Setup
from kitelogik.observability.tracer import setup_tracer
setup_tracer(service_name="my-agent")

That's the whole wire-up. With no further arguments, the runtime:
- creates a TracerProvider resource tagged service.name="my-agent"
- installs an always-on in-memory exporter (capacity 500 spans, most-recent kept)
- installs a BatchSpanProcessor(ConsoleSpanExporter) writing to stderr so it doesn't clobber demo stdout
Subsequent setup_tracer(...) calls re-initialise — the in-memory buffer clears and any previous file handle closes cleanly.
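The "capacity 500, most-recent kept, cleared on re-initialise" behaviour can be pictured with a bounded deque. This is only an illustration of the semantics described above, not the runtime's actual implementation; the class name RecentSpanBuffer is hypothetical.

```python
from collections import deque

CAPACITY = 500  # the capacity stated above


class RecentSpanBuffer:
    """Illustrative stand-in for a bounded buffer that keeps the most recent spans."""

    def __init__(self, capacity=CAPACITY):
        self._spans = deque(maxlen=capacity)

    def add(self, span):
        # deque(maxlen=...) discards the oldest entry once the buffer is full.
        self._spans.append(span)

    def clear(self):
        # Mirrors the documented effect of calling setup_tracer(...) again.
        self._spans.clear()

    def snapshot(self):
        return list(self._spans)


buf = RecentSpanBuffer()
for i in range(505):
    buf.add({"span_id": i})
kept = buf.snapshot()
print(len(kept), kept[0]["span_id"], kept[-1]["span_id"])  # 500 5 504
```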
Send to an OTLP collector
For production observability stacks (Tempo, Jaeger, Honeycomb, Grafana Cloud, Datadog, …), pass an OTLP HTTP endpoint:
setup_tracer(
    service_name="my-agent",
    otlp_endpoint="http://tempo.observability.svc:4318",
)

The runtime appends /v1/traces and adds an OTLPSpanExporter through a BatchSpanProcessor. The in-memory exporter stays on, so the in-process dashboard panel keeps working alongside the external collector.
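As a rough sketch of what "appends /v1/traces" produces for a given base endpoint, the helper below builds the final URL. The function name otlp_traces_url is hypothetical, and the handling of trailing slashes or of an endpoint that already carries the suffix is an assumption; check the runtime's behaviour before relying on either.

```python
def otlp_traces_url(endpoint: str) -> str:
    """Build the OTLP/HTTP traces URL from a base collector endpoint."""
    base = endpoint.rstrip("/")
    # Assumption: an endpoint that already ends in /v1/traces is left unchanged.
    if base.endswith("/v1/traces"):
        return base
    return base + "/v1/traces"


url = otlp_traces_url("http://tempo.observability.svc:4318")
print(url)  # http://tempo.observability.svc:4318/v1/traces
```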
OTLP HTTP, not gRPC
The shipped exporter is opentelemetry.exporter.otlp.proto.http. If your backend accepts only gRPC, point otlp_endpoint at a Tempo / OTel-Collector sidecar that accepts OTLP over HTTP and re-exports over gRPC.
Send to a local file
For replayable local traces (or shipping to log-based observability):
setup_tracer(service_name="my-agent", trace_file="traces.jsonl")

Spans land in JSON-lines, one per finished span, via a batched ConsoleSpanExporter writing to the file handle. Re-calling setup_tracer cleanly closes the previous handle.
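A trace file in that shape can be replayed with nothing but the standard library. The sketch below counts finished spans per span name. It assumes each line is one JSON object with a top-level "name" key, which is how OpenTelemetry's span JSON is laid out; the helper name count_spans_by_name and the demo file contents are made up.

```python
import json
import os
import tempfile
from collections import Counter


def count_spans_by_name(path):
    """Count finished spans per span name in a JSON-lines trace file."""
    counts = Counter()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            span = json.loads(line)
            counts[span.get("name", "<unnamed>")] += 1
    return counts


# Demo against a throwaway file standing in for traces.jsonl.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "traces.jsonl")
    with open(demo_path, "w", encoding="utf-8") as fh:
        fh.write(json.dumps({"name": "kitelogik.adapter.tool_call"}) + "\n")
        fh.write(json.dumps({"name": "kitelogik.adapter.tool_call"}) + "\n")
    demo_counts = count_spans_by_name(demo_path)

print(dict(demo_counts))
```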
Test mode
pytest runs that import the runtime should pass testing=True to suppress all file/network export — only the in-memory buffer stays active:
setup_tracer(service_name="kitelogik-tests", testing=True)

This is what the bundled tests use. The buffer clears on every setup_tracer call, so tests don't leak spans across cases.
Read finished spans in-process
The in-memory exporter exposes a JSON-friendly snapshot:
from kitelogik.observability.tracer import get_finished_spans
spans = get_finished_spans()
# [
# {
# "trace_id": "00000000000000000000000000000001",
# "span_id": "0000000000000001",
# "parent_span_id": "...",
# "name": "kitelogik.adapter.tool_call",
# "start_ms": 1714291200000,
# "end_ms": 1714291200042,
# "duration_ms": 42,
# "status": "OK",
# "session_id": "sess_001",
# "user_role": "support_agent",
# "tool_name": "approve_refund",
# "policy_allow": False,
# "policy_deny": True,
# "policy_hitl": False,
# "risk_tier": "TRANSACTIONAL_HIGH",
# "reason": "Refunds over $100 require manager approval",
# ...
# },
# ...
# ]

It's just stdlib OTel and a flat dict: fine to render in your own UI, emit to a metrics pipeline, or assert against in tests. The buffer holds the most recent 500 finished spans; older spans are silently dropped.
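For the "assert against in tests" case, a small filter over the snapshot is usually enough. The sketch below uses only keys that appear in the snapshot above (tool_name, policy_deny, reason). The helper name denied_calls and the second sample span (lookup_order) are hypothetical, and the literal list stands in for a real get_finished_spans() call.

```python
def denied_calls(spans):
    """Return (tool_name, reason) for every span whose policy decision was a deny."""
    return [
        (s.get("tool_name"), s.get("reason"))
        for s in spans
        if s.get("policy_deny") is True
    ]


# Stand-in for get_finished_spans(); the first record copies values from the snapshot.
spans = [
    {
        "name": "kitelogik.adapter.tool_call",
        "tool_name": "approve_refund",
        "policy_allow": False,
        "policy_deny": True,
        "reason": "Refunds over $100 require manager approval",
    },
    {
        "name": "kitelogik.adapter.tool_call",
        "tool_name": "lookup_order",  # hypothetical tool
        "policy_allow": True,
        "policy_deny": False,
        "reason": "",
    },
]

denied = denied_calls(spans)
assert denied == [("approve_refund", "Refunds over $100 require manager approval")]
```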
Pattern: alert on deny rate
Every gate-evaluation span sets kitelogik.policy.deny. With a collector in front of your trace store, span-metrics or trace-to-metrics rules turn that boolean into a counter:
# OTel Collector — spanmetrics processor (excerpt)
processors:
  spanmetrics:
    metrics_exporter: prometheus
    dimensions:
      - name: kitelogik.tool_name
      - name: kitelogik.policy.risk_tier
      - name: kitelogik.policy.deny

Then a Prometheus query like:
sum by (kitelogik_tool_name, kitelogik_policy_risk_tier) (
  rate(traces_spanmetrics_calls_total{kitelogik_policy_deny="true"}[5m])
)

…surfaces a denial rate per tool, broken down by risk tier. A sudden spike on kitelogik_policy_deny="true" for a previously-quiet tool is your signal that either an agent is misbehaving or a policy change landed and the agent loop is now hitting it constantly.
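When no collector is available, a similar signal can be approximated in-process from the span snapshot. The sketch below computes the share of denied decisions per tool and flags tools above a threshold. It assumes the flat keys tool_name and policy_deny from the get_finished_spans() snapshot; the function names, the sample tools, and the 0.5 threshold are arbitrary choices for illustration.

```python
from collections import defaultdict


def deny_rate_by_tool(spans):
    """Return {tool_name: denied / total} over spans that carry a policy decision."""
    totals = defaultdict(int)
    denied = defaultdict(int)
    for s in spans:
        if "policy_deny" not in s:
            continue  # span carries no policy decision
        tool = s.get("tool_name", "<unknown>")
        totals[tool] += 1
        if s["policy_deny"]:
            denied[tool] += 1
    return {tool: denied[tool] / totals[tool] for tool in totals}


def tools_over_threshold(spans, threshold=0.5):
    """Return the sorted tool names whose deny rate exceeds the threshold."""
    rates = deny_rate_by_tool(spans)
    return sorted(tool for tool, rate in rates.items() if rate > threshold)


# Hypothetical sample: 3 of 4 approve_refund calls denied, 0 of 2 lookup_order calls.
sample = (
    [{"tool_name": "approve_refund", "policy_deny": True}] * 3
    + [{"tool_name": "approve_refund", "policy_deny": False}]
    + [{"tool_name": "lookup_order", "policy_deny": False}] * 2
)
rates = deny_rate_by_tool(sample)
noisy = tools_over_threshold(sample)
print(rates, noisy)
```

Unlike the Prometheus rate, this is a ratio over whatever is currently in the buffer, so it is better suited to a local dashboard than to alerting.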
Pattern: trace-link an audit record
Every audit record carries the session_id; every span carries kitelogik.session_id. Drop both on the same record in your log shipping pipeline and a click on a denied call in the audit log opens the full trace:
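import logging
from opentelemetry import trace

log = logging.getLogger(__name__)

# `record` is the audit record for the denied call.
# trace_id is an int on the span context; log it as 32-char hex so it matches the trace store.
log.warning(
    "denied",
    extra={
        "session_id": record.session_id,
        "tool_name": record.tool_name,
        "trace_id": trace.format_trace_id(trace.get_current_span().get_span_context().trace_id),
        "policy_version": record.policy_version,
    },
)

In Grafana / Loki this is the standard derived-fields trick: regex out trace_id from the log line and link it into the Traces UI.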
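The regex step can be tried out locally before configuring it in Grafana. OpenTelemetry exposes the trace ID as an integer, while trace stores index it as 32 lowercase hex characters, so the sketch formats it first and then extracts it from a log line. The log-line layout (key=value pairs) and the pattern are assumptions; adapt both to your own log formatter.

```python
import re


def format_trace_id(trace_id: int) -> str:
    """Render a 128-bit trace ID as 32 lowercase hex characters."""
    return format(trace_id, "032x")


# Hypothetical log-line layout: space-separated key=value pairs.
TRACE_ID_PATTERN = re.compile(r"trace_id=([0-9a-f]{32})")

line = (
    "denied session_id=sess_001 tool_name=approve_refund "
    f"trace_id={format_trace_id(1)}"
)
match = TRACE_ID_PATTERN.search(line)
extracted = match.group(1) if match else None
print(extracted)
```

The same pattern, with one capture group around the hex ID, is the kind of expression a derived field expects.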
Rendering a dashboard
A trace dashboard on top of this is a poll loop: read get_finished_spans(), group by kitelogik.session_id, render a waterfall per session — coloured by allow / deny / HITL — and the operator sees the full agent turn (model → adapter span → gate span → sanitiser span) in one view. The data is the stdlib OTel dict; the dashboard is just a renderer.
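The render step of such a poll loop might look like the sketch below, which groups span records by session and draws one text bar per span. It assumes the flat keys from the get_finished_spans() snapshot (session_id, start_ms, end_ms, policy_allow, policy_deny, policy_hitl). The function names, the bar width, and the placeholder gate span name are made up, and a real dashboard would call this on each poll rather than on a literal list.

```python
from collections import defaultdict


def outcome(span):
    """Classify a span as 'deny', 'hitl', 'allow', or 'n/a' from its policy flags."""
    if span.get("policy_deny"):
        return "deny"
    if span.get("policy_hitl"):
        return "hitl"
    if span.get("policy_allow"):
        return "allow"
    return "n/a"


def waterfall_by_session(spans, width=40):
    """Group spans by session_id and render one text bar per span."""
    sessions = defaultdict(list)
    for s in spans:
        sessions[s.get("session_id", "<no session>")].append(s)

    out = {}
    for session_id, group in sessions.items():
        group.sort(key=lambda s: s["start_ms"])
        t0 = group[0]["start_ms"]
        t1 = max(s["end_ms"] for s in group)
        scale = width / max(t1 - t0, 1)  # characters per millisecond
        rows = []
        for s in group:
            offset = int((s["start_ms"] - t0) * scale)
            length = max(int((s["end_ms"] - s["start_ms"]) * scale), 1)
            rows.append(f"{' ' * offset}{'#' * length} {s['name']} [{outcome(s)}]")
        out[session_id] = rows
    return out


# Hypothetical sample: one session, an adapter span wrapping a gate span.
sample = [
    {"session_id": "sess_001", "name": "gate (placeholder)", "policy_deny": True,
     "start_ms": 1714291200010, "end_ms": 1714291200030},
    {"session_id": "sess_001", "name": "kitelogik.adapter.tool_call", "policy_deny": True,
     "start_ms": 1714291200000, "end_ms": 1714291200040},
]
rows = waterfall_by_session(sample)["sess_001"]
print("\n".join(rows))
```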
Related
- Performance benchmarks — measure per-call gate latency; the spans give you the same number in production
- Audit trail export — durable record of every decision; pair with traces for full forensics
- Architecture — where the tracer sits alongside the gate and the audit store