Resource budgets
Three independent budgets per session, all on SessionContext:
| Total field | Used field | Unit |
|---|---|---|
| budget_total_tokens | budget_used_tokens | LLM tokens |
| budget_total_api_calls | budget_used_api_calls | Tool / API invocations |
| budget_total_cost_cents | budget_used_cost_cents | Currency (integer cents) |
When any budget reaches used >= total, agent_budget.rego denies the next governed event.
A budget is "set" when the corresponding budget_total_* field is non-null. When every budget_total_* field is null, no budget is enforced.
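The exhaustion rule can be restated in a few lines of Python. This is only a sketch of the semantics described above, not the bundled Rego policy. The helper name is_exhausted is hypothetical, and treating a missing used counter as zero is an assumption.

```python
from typing import Optional


def is_exhausted(total: Optional[int], used: Optional[int]) -> bool:
    """Return True when a budget is set and fully consumed.

    Hypothetical helper mirroring the rule described above: a null
    (None) total means the budget is not enforced.
    """
    if total is None:
        return False  # budget not set, so nothing to enforce
    return (used or 0) >= total  # assumption: a missing counter counts as zero


# One (total, used) pair per budget; any exhausted budget denies the next event.
budgets = {
    "tokens": (100_000, 100_000),  # used == total, exhausted
    "api_calls": (200, 37),        # still within budget
    "cost_cents": (None, 1_250),   # no total set, never enforced
}
denied = any(is_exhausted(total, used) for total, used in budgets.values())
```

Here denied is true because the token budget alone is exhausted. The three budgets are checked independently.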
Set a budget
from kitelogik import SessionContext

context = SessionContext(
    session_id="sess_001",
    user_role="support_agent",
    session_scopes=["read_customer", "approve_refund"],
    budget_total_tokens=100_000,
    budget_used_tokens=0,
    budget_total_api_calls=200,
    budget_used_api_calls=0,
    # cost cents omitted: no cost budget for this session
)

You can mix and match: set only the budgets you want enforced. The others stay null and short-circuit to allow.
Update counters between gate calls
A SessionContext instance is treated as immutable, so increment the counters by creating an updated copy with model_copy(update=...) and using that copy from then on:
# After a model interaction reports usage
context = context.model_copy(update={
    "budget_used_tokens": (context.budget_used_tokens or 0) + tokens_this_call,
    "budget_used_api_calls": (context.budget_used_api_calls or 0) + 1,
})

# Pass the updated context to the next gate evaluation
result = await toolbox.call("approve_refund", {...})  # uses the new context

If your runtime stores SessionContext once and passes it everywhere (common in framework adapters), update the stored reference after every significant call.
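One way to keep the stored reference current is to give it a single owner. The sketch below is illustrative only: ContextHolder and FakeSessionContext are hypothetical names, and the frozen dataclass stands in for SessionContext so the example runs without kitelogik. With the real class, the copy step would be model_copy(update=...).

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FakeSessionContext:
    """Stand-in for SessionContext so the sketch runs without kitelogik."""
    session_id: str
    budget_used_tokens: Optional[int] = None
    budget_used_api_calls: Optional[int] = None


class ContextHolder:
    """Hypothetical single owner of the current context for a session."""

    def __init__(self, context: FakeSessionContext) -> None:
        self.current = context

    def record_usage(self, tokens: int, api_calls: int = 1) -> FakeSessionContext:
        # Build a new immutable context and swap the stored reference.
        self.current = replace(
            self.current,
            budget_used_tokens=(self.current.budget_used_tokens or 0) + tokens,
            budget_used_api_calls=(self.current.budget_used_api_calls or 0) + api_calls,
        )
        return self.current


holder = ContextHolder(FakeSessionContext(session_id="sess_001"))
holder.record_usage(tokens=1_200)
holder.record_usage(tokens=800)
```

Callers read holder.current immediately before each gate evaluation instead of caching their own copy, so no component keeps evaluating against stale counters.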
Where the gate fires
Per agent_budget.rego, budget rules deny on two event types:
# Explicit agent.budget events
deny if { input.event_type == "agent.budget"; _token_budget_exhausted }
# Opportunistic enforcement on every tool call
deny if { input.event_type == "tool_call"; _token_budget_exhausted }

The second rule is the important one: you don't have to fire explicit agent.budget events to get enforcement. As long as SessionContext carries up-to-date budget_used_* values, every governed tool_call checks the limits.
Pattern: token budget from an OpenAI response
The cleanest pattern is to keep SessionContext on the same object the gate evaluates against — typically a GovernedToolbox, where the context is evaluated per-call:
from openai import AsyncOpenAI
from kitelogik import GovernedToolbox

client = AsyncOpenAI()
toolbox = GovernedToolbox(gate=gate, context=context)
# ... register tools ...

while True:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        # ... tools, etc. ...
    )

    # Update the budget counters from the usage block
    if response.usage:
        context = context.model_copy(update={
            "budget_used_tokens":
                (context.budget_used_tokens or 0) + response.usage.total_tokens,
            "budget_used_api_calls":
                (context.budget_used_api_calls or 0) + 1,
        })
        toolbox = GovernedToolbox(gate=gate, context=context)  # rebuild with updated context
        # re-register tools on the rebuilt toolbox here

    if response.choices[0].finish_reason != "tool_calls":
        break

    # ... dispatch tool calls through `toolbox.call(...)`, append results ...

When budget_used_tokens >= budget_total_tokens, the next toolbox.call(...) is denied, because agent_budget.rego checks the budget on every tool_call event regardless of which tool is being called.
TIP
Framework adapters (OpenAIAdapter, the LangChain helpers, etc.) take the context at construction time. To apply an updated SessionContext to a running adapter, the supported path is to construct a new adapter and re-register the tools — the adapter's internal context is not part of its public API.
Pattern: cost from a token-based price model
If you know the model's per-token price, derive the cost counter (budget_used_cost_cents) from the token usage:
PRICE_INPUT_CENTS_PER_1K = 0.250
PRICE_OUTPUT_CENTS_PER_1K = 1.000

def call_cost_cents(usage):
    # int() truncates toward zero, so a call costing less than one cent adds 0
    return int(
        usage.prompt_tokens * PRICE_INPUT_CENTS_PER_1K / 1000 +
        usage.completion_tokens * PRICE_OUTPUT_CENTS_PER_1K / 1000
    )

context = context.model_copy(update={
    "budget_used_cost_cents":
        (context.budget_used_cost_cents or 0) + call_cost_cents(response.usage),
})

The integer-cents convention avoids floating-point drift over long sessions.
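One caveat with converting each call to whole cents is that calls costing less than a cent contribute nothing, so many small calls can go uncounted. A possible workaround, sketched here under assumptions, is to accumulate in a finer integer unit and convert to cents only when writing the counter. CostMeter and the micro-cent constants are hypothetical names, and the prices simply restate the example prices above per token.

```python
from dataclasses import dataclass

# Example prices restated as integer micro-cents per token
# (0.250 cents per 1K tokens == 250 micro-cents per token).
PRICE_INPUT_MICROCENTS_PER_TOKEN = 250
PRICE_OUTPUT_MICROCENTS_PER_TOKEN = 1_000
MICROCENTS_PER_CENT = 1_000_000


@dataclass
class CostMeter:
    """Hypothetical accumulator: integer micro-cents in, whole cents out."""
    microcents: int = 0

    def add(self, prompt_tokens: int, completion_tokens: int) -> None:
        # Pure integer arithmetic, so nothing is lost between calls.
        self.microcents += (
            prompt_tokens * PRICE_INPUT_MICROCENTS_PER_TOKEN
            + completion_tokens * PRICE_OUTPUT_MICROCENTS_PER_TOKEN
        )

    @property
    def cents(self) -> int:
        # Ceiling division: a partially spent cent counts against the budget.
        return -(-self.microcents // MICROCENTS_PER_CENT)


meter = CostMeter()
for _ in range(4):
    meter.add(prompt_tokens=1_000, completion_tokens=500)  # 0.75 cents each
```

After these four calls meter.cents is 3, whereas truncating each call separately would have recorded 0. The value of meter.cents can then be written to budget_used_cost_cents with model_copy as usual. Rounding up is a conservative choice, not something the bundled policy requires.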
Pattern: per-role budget caps
Combine agent_budget.rego with a custom rule that checks role:
package kitelogik.agent_budget

import future.keywords.if
import future.keywords.in

# Cap guest role at $5 cost regardless of explicit budget
deny if {
    input.context.user_role == "guest"
    input.context.budget_used_cost_cents != null
    input.context.budget_used_cost_cents > 500
}

# Cap untrusted-tier sessions at 50K tokens
deny if {
    input.context.user_role in {"guest", "anonymous"}
    input.context.budget_used_tokens != null
    input.context.budget_used_tokens > 50000
}

Because these rules live in the same package, deny rules from all files merge, so the strictest rule wins.
Pattern: report budget exhaustion to the user
Catch the deny and surface a friendly message rather than a generic GovernanceError:
from kitelogik import GovernanceError

try:
    result = await toolbox.call("approve_refund", {...})
except GovernanceError as exc:
    if "budget" in exc.decision.reason.lower():
        return f"Session budget exhausted ({context.budget_used_tokens} tokens used)."
    raise

main.rego's deny propagation carries the sub-policy's reason into decision.reason, so a budget denial can be identified from that string.
What is NOT enforced by the gate
- The used_* counters incrementing themselves. Your runtime owns this. The gate trusts what you put in SessionContext.
- Cross-agent / org-wide budgets. Budgets in the bundled policy are per session. Aggregating spend across agent sessions for an org-wide cap is your orchestrator's job: pass the aggregated counter into SessionContext per session and let agent_budget.rego enforce.
- Soft warnings before hard exhaustion. The gate is allow-or-deny. Implement warnings in your runtime by checking the ratio yourself.
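For the soft-warning case, the ratio check can live entirely in your runtime. The following is a minimal sketch. The helper name budget_warning and the 80% default threshold are illustrative assumptions, not part of kitelogik.

```python
from typing import Optional


def budget_warning(used: Optional[int], total: Optional[int],
                   threshold: float = 0.8) -> Optional[str]:
    """Return a warning string once usage crosses the threshold, else None."""
    if total is None or total <= 0:
        return None  # no budget set, nothing to warn about
    ratio = (used or 0) / total
    if ratio >= 1.0:
        return None  # already exhausted; the gate's deny takes over here
    if ratio >= threshold:
        return f"{ratio:.0%} of budget used"
    return None


message = budget_warning(used=85_000, total=100_000)
```

Call it with the budget_used_* and budget_total_* pair you care about after each counter update, and surface the message however your runtime reports warnings.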
Related
- agent_budget.rego: the bundled budget policy
- Governance events: SessionContext field reference