ADR-0002: Rule-Based Planner for V1¶
Status: Accepted Date: 2026-06-22 Deciders: Paxman core team Supersedes: — Superseded by: —
Context and Problem Statement¶
Paxman must synthesize a deterministic execution plan. Two architectural options for V1 exist:
- Rule-based planner — a fixed heuristic chain (explicit evidence → local deterministic → ... →
UNRESOLVED) that picks the next capability based on contract field metadata, input profile, budget, and policy. - LLM-based planner — an LLM decides which capabilities to invoke for each field.
How should the V1 planner be implemented?
Decision Drivers¶
- Determinism is required (PRD §4.5) — given the same inputs, the planner must produce the same plan.
- No AI in the V1 critical path (PRD §7.6) — V1 planning must be rule-based.
- Tiny API (PRD §4.7) — the caller should not need to configure a model.
- Cost is first-class (PRD §4.6) — using an LLM to plan is expensive.
- Replayability (PRD §7.4) — plans must be reproducible from contract + input + budget + policy.
Considered Options¶
Option A — Rule-based planner (chosen)¶
A fixed heuristic chain. For each required field, the planner evaluates the chain in order and selects the first applicable capability that fits the budget and policy.
Pros:
- Deterministic by construction.
- Cheap (no inference cost).
- Easy to test and reason about.
- Plays well with replay.
Cons:
- Less flexible than an LLM-based planner.
- May miss non-obvious plans for unusual input types.
- Requires hand-tuning the heuristic chain.
Option B — LLM-based planner¶
Use an LLM to generate the FieldPlan[] from the contract and input profile.
Pros:
- Flexible; can plan for unusual input types.
- Could discover novel capability combinations.
Cons:
- Non-deterministic by default.
- Expensive (one inference per field, at minimum).
- Adds a provider dependency to the core.
- Violates PRD §7.6 (no LLM in the V1 critical path).
- Replay is harder (plan may differ between runs).
Option C — Hybrid: rule-based with LLM refinement¶
Run the rule-based planner first; if the plan is "low confidence," ask an LLM to refine it.
Pros:
- Best of both worlds in theory.
Cons:
- Doubles complexity.
- The "low confidence" threshold is itself heuristic.
- Replay is harder.
Decision Outcome¶
Chosen option: A (Rule-based). The V1 planner is a pure function over (canonical contract, input profile, budget, policy, capability registry). The heuristic chain is documented in ARCHITECTURE.md §4.2 and PACKAGE_STRUCTURE.md §4.
LLM-based planning is postponed to V2 (see ARCHITECTURE.md §17.2).
Consequences¶
Positive¶
- Determinism is automatic.
- No inference cost in the planner.
- Replay is straightforward.
- Easy to add a new capability without retraining a planner.
Negative¶
- The heuristic chain is hand-tuned. V2 may add learned heuristics on top.
- Unusual input/contract combinations may produce sub-optimal plans.
Neutral¶
- The
Heuristicprotocol is exposed as a post-V1 extension point.
Validation¶
- Property tests verify that the same inputs produce the same plan.
- The planner test suite covers every combination of input profile and contract shape.
- Determinism tests assert byte-equal
ExecutionPlanJSON for the same inputs.
References¶
- PRD.md §4.5, §7.6
- ARCHITECTURE.md §4.2, §17.1
- PACKAGE_STRUCTURE.md §4