
Plan-before-execute

Some agents emit an explicit multi-step plan and ask permission to execute it. Plan-before-execute lets the policy gate reject the whole plan before any step fires. Two practical wins:

  • Cheaper failures — no half-completed plans, no rollback needed
  • Structural rules — policies can enforce invariants on the whole plan ("≤ 50 steps", "≤ 3 writes", "must include a verification step") that no per-step rule can express

The gate dispatches to agent_plan.rego, which ships two default invariants: a hard _max_steps = 50 cap and a hard _blocked_tools = {execute_shell, run_command, drop_database, delete_all} denylist.
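The two default invariants are simple enough to mirror in plain Python, which can help when unit-testing a planner without a running policy engine. The sketch below is an illustration only and is not the contents of agent_plan.rego: the function name `check_default_invariants` and the violation messages are hypothetical, and it assumes the cap rejects plans with more than 50 steps.

```python
# Illustrative mirror of the two default invariants described above.
# This is NOT the shipped Rego module; names and messages are hypothetical.
MAX_STEPS = 50
BLOCKED_TOOLS = {"execute_shell", "run_command", "drop_database", "delete_all"}


def check_default_invariants(plan: list[dict]) -> list[str]:
    """Return violation messages; an empty list means the plan passes."""
    violations = []
    if len(plan) > MAX_STEPS:
        violations.append(f"plan has {len(plan)} steps, cap is {MAX_STEPS}")
    for index, step in enumerate(plan):
        tool = step.get("tool_name")
        if tool in BLOCKED_TOOLS:
            violations.append(f"step {index} uses blocked tool {tool!r}")
    return violations


if __name__ == "__main__":
    ok_plan = [{"tool_name": "search_docs", "args": {"query": "refund policy"}}]
    bad_plan = ok_plan + [{"tool_name": "drop_database", "args": {}}]
    print(check_default_invariants(ok_plan))
    print(check_default_invariants(bad_plan))
```

A local check like this only predicts what the gate is likely to say; the gate's decision remains authoritative.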

Build the plan

A plan is a list[dict]. The OSS rules read step.tool_name; you can populate any other keys for your own rules to match on.

python
plan = [
    {"tool_name": "search_docs",     "args": {"query": "refund policy"}},
    {"tool_name": "get_customer",    "args": {"customer_id": "c1"}},
    {"tool_name": "approve_refund",  "args": {"customer_id": "c1", "amount": 50}},
]
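Because rules match on step.tool_name, a malformed step (a missing or non-string tool_name) may slip past rules that were meant to catch it. A minimal shape check before submitting the plan might look like the following; `validate_plan_shape` is a hypothetical helper, not part of the library, and treating args as an optional dict is an assumption.

```python
def validate_plan_shape(plan) -> None:
    """Raise ValueError unless plan is a list of dicts with a string tool_name.

    Hypothetical helper for illustration; not provided by the library.
    """
    if not isinstance(plan, list):
        raise ValueError("plan must be a list")
    for index, step in enumerate(plan):
        if not isinstance(step, dict):
            raise ValueError(f"step {index} must be a dict")
        if not isinstance(step.get("tool_name"), str) or not step["tool_name"]:
            raise ValueError(f"step {index} needs a non-empty string 'tool_name'")
        # Assumption: 'args' is optional, but must be a dict when present.
        if not isinstance(step.get("args", {}), dict):
            raise ValueError(f"step {index} 'args' must be a dict when present")


if __name__ == "__main__":
    validate_plan_shape([{"tool_name": "search_docs", "args": {"query": "refund policy"}}])
    print("plan shape ok")
```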

Evaluate via PolicyGate.evaluate_plan

The lowest-level path:

python
decision = await gate.evaluate_plan(plan, context)

if decision.allow and not decision.deny:
    # Plan passes — start executing step by step.
    for step in plan:
        await toolbox.call(step["tool_name"], step["args"])
else:
    # Plan rejected — surface decision.reason to the agent or operator.
    print(f"Plan denied: {decision.reason}")

evaluate_plan constructs a GovernanceEvent(event_type="agent.plan", steps=plan, …) and runs it through the standard pipeline. It returns a PolicyDecision and does not raise on deny; the caller decides what to do with the result.

Evaluate via GovernedToolbox.evaluate_plan

A thin wrapper that does raise:

python
from kitelogik.governed import GovernanceError


async def run_plan(toolbox, plan):
    # evaluate_plan raises GovernanceError when the plan is rejected.
    try:
        await toolbox.evaluate_plan(plan)
    except GovernanceError as exc:
        print(f"Plan rejected: {exc}")
        return

    # Allowed: execute step by step.
    for step in plan:
        await toolbox.call(step["tool_name"], step["args"])

The toolbox version raises GovernanceError if decision.deny is true or decision.allow is false. Use this when you want plan rejection to short-circuit your code naturally.
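The raise condition can be restated as a small function. The sketch below uses stand-in types so it runs on its own: `Decision` and `PlanRejected` are hypothetical substitutes for PolicyDecision and GovernanceError, and the real wrapper may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical stand-in for PolicyDecision with only the fields used here."""
    allow: bool
    deny: bool
    reason: str = ""


class PlanRejected(Exception):
    """Hypothetical stand-in for GovernanceError."""


def raise_on_reject(decision: Decision) -> None:
    # Reject when an explicit deny fired OR when nothing granted allow.
    if decision.deny or not decision.allow:
        raise PlanRejected(decision.reason or "plan not allowed")


if __name__ == "__main__":
    raise_on_reject(Decision(allow=True, deny=False))  # passes silently
    try:
        raise_on_reject(Decision(allow=True, deny=True, reason="too many writes"))
    except PlanRejected as exc:
        print(f"rejected: {exc}")
```

Note that allow=True does not pass on its own: a deny still wins, which matches the `decision.allow and not decision.deny` check used with the lower-level API.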

Plan-level invariants you can write

The OSS module covers step-count and blocked-tool checks. Common project-specific invariants people add:

rego
package kitelogik.agent_plan

import future.keywords.if
import future.keywords.in

# Cap write operations per plan
_write_tools := {"approve_refund", "send_notification", "write_memory"}

deny if {
    input.event_type == "agent.plan"
    write_count := count([s | s := input.steps[_]; s.tool_name in _write_tools])
    write_count > 3
}

# Reject plans that open with approve_refund or send_notification
# (a narrow form of "the plan must start with a read")
deny if {
    input.event_type == "agent.plan"
    count(input.steps) > 0
    input.steps[0].tool_name in {"approve_refund", "send_notification"}
}

# Escalate plans that touch high-value transactions
requires_hitl if {
    input.event_type == "agent.plan"
    some step in input.steps
    step.tool_name == "approve_refund"
    is_number(step.args.amount)
    step.args.amount > 1000
}
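When writing rules like these, it can help to state the expected outcome for a few sample plans in Python first and then check that the policy agrees. The functions below restate the three rules above for that purpose; they are a hypothetical test aid, not something the library provides, and the Rego policy stays the source of truth.

```python
# Python restatement of the three example rules, for building test expectations.
# Hypothetical helpers; the Rego policy remains authoritative.
WRITE_TOOLS = {"approve_refund", "send_notification", "write_memory"}
FIRST_STEP_DENYLIST = {"approve_refund", "send_notification"}


def too_many_writes(steps, limit=3):
    """Mirror of the write cap: deny when more than `limit` steps are writes."""
    return sum(1 for s in steps if s.get("tool_name") in WRITE_TOOLS) > limit


def starts_with_write(steps):
    """Mirror of the first-step rule: deny when step 0 is on the denylist."""
    return len(steps) > 0 and steps[0].get("tool_name") in FIRST_STEP_DENYLIST


def needs_hitl(steps, threshold=1000):
    """Mirror of the escalation rule for high-value refunds."""
    for s in steps:
        amount = s.get("args", {}).get("amount")
        # bool is excluded because Rego's is_number is false for booleans.
        if (s.get("tool_name") == "approve_refund"
                and isinstance(amount, (int, float))
                and not isinstance(amount, bool)
                and amount > threshold):
            return True
    return False


if __name__ == "__main__":
    plan = [
        {"tool_name": "get_customer", "args": {"customer_id": "c1"}},
        {"tool_name": "approve_refund", "args": {"customer_id": "c1", "amount": 5000}},
    ]
    print(too_many_writes(plan), starts_with_write(plan), needs_hitl(plan))
```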

Plan rules don't replace per-step rules

Plan-level allow is necessary but not sufficient for execution. Every step still runs through the per-tool_call gate when it actually fires. Use plans for shape ("≤ 3 writes", "must include a check") and tool calls for authority ("this user can approve up to $100").
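The split between shape and authority can be illustrated with a toy model in which a plan passes the plan-level check but one step is still stopped at call time. Everything here is hypothetical (function names, the write set, the $100 limit taken from the example above); it models the two layers in plain Python rather than calling the real gate.

```python
# Toy model of the two layers; all names and limits are illustrative.
WRITE_TOOLS = {"approve_refund", "send_notification", "write_memory"}


def plan_shape_ok(plan, max_writes=3):
    """Plan-level rule: looks at the whole sequence at once."""
    writes = sum(1 for step in plan if step["tool_name"] in WRITE_TOOLS)
    return writes <= max_writes


def step_authorized(step, approval_limit=100):
    """Per-step rule: looks at one call and the caller's authority."""
    if step["tool_name"] != "approve_refund":
        return True
    return step["args"]["amount"] <= approval_limit


def run(plan):
    """Return (executed tool names, status)."""
    if not plan_shape_ok(plan):
        return [], "plan rejected"
    executed = []
    for step in plan:
        if not step_authorized(step):
            return executed, f"step denied: {step['tool_name']}"
        executed.append(step["tool_name"])  # a real agent would call the tool here
    return executed, "done"


if __name__ == "__main__":
    plan = [
        {"tool_name": "get_customer", "args": {"customer_id": "c1"}},
        {"tool_name": "approve_refund", "args": {"customer_id": "c1", "amount": 500}},
    ]
    print(run(plan))
```

In this model the plan has an acceptable shape (one write), yet the refund step is refused because it exceeds the caller's limit, so executors should still handle a per-step deny partway through an approved plan.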

Pattern: replan on deny

For agents that produce candidate plans and refine them based on policy feedback:

python
from kitelogik.governed import GovernanceError


async def execute_with_planner(planner, max_attempts=3):
    decision = None  # stays None if max_attempts is 0
    for attempt in range(max_attempts):
        plan = await planner.propose()
        decision = await gate.evaluate_plan(plan, context)

        # Same check as the basic example: allowed and not explicitly denied.
        if decision.allow and not decision.deny:
            return await execute_plan(plan)

        # Feed the deny reason back to the planner so it can revise
        await planner.feedback(
            f"Plan rejected by policy: {decision.reason} "
            f"(rule: {decision.rule_matched})"
        )

    raise GovernanceError("Planner could not produce an allowed plan",
                          decision=decision)
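To try the loop without a real gate or planner, the pattern can be exercised against in-memory fakes. The sketch below is self-contained and entirely hypothetical: `FakeGate`, `FakePlanner`, `Decision`, and `replan_until_allowed` are stand-ins written for this example and only imitate the method names used above.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical stand-in for PolicyDecision."""
    allow: bool
    deny: bool
    reason: str = ""


class FakeGate:
    """Denies any plan that contains 'drop_database'."""

    async def evaluate_plan(self, plan, context):
        if any(s["tool_name"] == "drop_database" for s in plan):
            return Decision(False, True, "blocked tool: drop_database")
        return Decision(True, False)


class FakePlanner:
    """Proposes a bad plan first, then omits the offending step after feedback."""

    def __init__(self):
        self.feedback_log = []

    async def propose(self):
        plan = [{"tool_name": "search_docs", "args": {}}]
        if not self.feedback_log:
            plan.append({"tool_name": "drop_database", "args": {}})
        return plan

    async def feedback(self, message):
        self.feedback_log.append(message)


async def replan_until_allowed(planner, gate, context, max_attempts=3):
    """Return (attempt number, plan) for the first allowed plan, else (None, None)."""
    for attempt in range(1, max_attempts + 1):
        plan = await planner.propose()
        decision = await gate.evaluate_plan(plan, context)
        if decision.allow and not decision.deny:
            return attempt, plan
        await planner.feedback(f"Plan rejected by policy: {decision.reason}")
    return None, None


if __name__ == "__main__":
    print(asyncio.run(replan_until_allowed(FakePlanner(), FakeGate(), context={})))
```

Here the fake planner succeeds on its second attempt, once the deny reason has been fed back to it.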

What gets logged

Every plan evaluation produces a single PolicyDecision event in the audit log with:

  • event_type = "agent.plan"
  • steps (the proposed sequence)
  • decision.allow / deny / requires_hitl
  • decision.rule_matched (which Rego rule fired)
  • policy_version (from the loaded bundle)

Per-step audit happens later, when each step actually fires as a tool_call.
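For downstream tooling that consumes the audit log, it may help to picture a plan record as a small JSON document. The layout below is an assumption made for illustration: it uses the field names listed above, but the nesting, the builder function `plan_audit_record`, and the sample values are hypothetical and may not match the actual log format.

```python
import json


def plan_audit_record(steps, allow, deny, requires_hitl, rule_matched, policy_version):
    """Build a JSON-serializable record with the fields listed above.

    The nesting under "decision" is an assumed layout, not the library's schema.
    """
    return {
        "event_type": "agent.plan",
        "steps": steps,
        "decision": {
            "allow": allow,
            "deny": deny,
            "requires_hitl": requires_hitl,
            "rule_matched": rule_matched,
        },
        "policy_version": policy_version,
    }


if __name__ == "__main__":
    record = plan_audit_record(
        steps=[{"tool_name": "search_docs", "args": {"query": "refund policy"}}],
        allow=True,
        deny=False,
        requires_hitl=False,
        rule_matched="example_rule",      # placeholder value
        policy_version="example-version",  # placeholder value
    )
    print(json.dumps(record, indent=2))
```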

Released under the Apache 2.0 License.