Paxman Architecture

Status: Draft v2 (post-documentation review)

Audience: Engineers implementing or extending Paxman.

Related docs: PACKAGE_STRUCTURE.md, GLOSSARY.md, docs/adr/

This document describes the system architecture of Paxman. It is implementation-agnostic at the level of "what subsystems exist" and "what their boundaries are," but it is concrete about why the boundaries are where they are. For module-level implementation rules, see PACKAGE_STRUCTURE.md.

1. Architecture Overview

Paxman Core consists of seven major subsystems, one of which is the public api/ surface. Each subsystem has strict, well-defined responsibilities. This separation preserves determinism, debuggability, and architectural clarity.

                ┌─────────────────────┐
                │       api/          │   ← The only thing users see
                └──────────┬──────────┘
                           │
        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
  ┌───────────┐      ┌───────────┐      ┌───────────────┐
  │ contract/ │      │ planner/  │      │ capabilities/ │
  │ adapters  │ ───▶ │ field plan│ ───▶ │ atomic ops    │
  │ validator │      │ synthesis │      │ (V1 surface)  │
  └───────────┘      └─────┬─────┘      └───────┬───────┘
                           │                    │
                           ▼                    ▼
                      ┌───────────┐        ┌───────────┐
                      │ executor/ │ ─────▶ │ reconciler│
                      │ runs plan │        │ / merge / │
                      └─────┬─────┘        │ confidence│
                            │              └─────┬─────┘
                            │                    │
                            └────────┬───────────┘
                                     ▼
                              ┌───────────┐
                              │ artifact/ │
                              │ final     │
                              │ bundle    │
                              └───────────┘

Subsystem count clarification: Paxman has seven internal subsystems (contract, planner, capabilities, executor, reconciler, artifact, api). Earlier drafts said "six" because the public flow diagram in §1 omitted capabilities/ (treated as the Executor's toolset) and api/ (treated as a presentation layer). This document treats all seven as first-class subsystems to match the actual package layout in PACKAGE_STRUCTURE.md.

#	Subsystem	One-line responsibility
1	`contract/`	Translate and validate caller contracts into the canonical internal form.
2	`planner/`	Build a deterministic, field-by-field execution plan.
3	`capabilities/`	Provide atomic, reusable operations. LLMs live behind inference providers.
4	`executor/`	Run the plan; collect evidence; stop early when satisfied.
5	`reconciler/`	Merge candidates, assign final confidence, resolve truth.
6	`artifact/`	Freeze the evidence-backed, replayable output.
7	`api/`	Expose the tiny public surface.

2. High-Level Flow

sequenceDiagram
    autonumber
    participant C as Caller
    participant A as api.normalize()
    participant CO as contract/
    participant P as planner/
    participant E as executor/
    participant CAP as capabilities/
    participant R as reconciler/
    participant AR as artifact/

    C->>A: normalize(input_data, contract, budget, policy)
    A->>CO: adapt + validate(contract)
    CO-->>A: CanonicalContract (or INVALID_CONTRACT)

    A->>P: plan(canonical_contract, input_profile, budget, policy)
    P-->>A: ExecutionPlan (FieldPlan[])

    A->>E: run(plan, canonical_contract, capabilities)
    loop for each FieldPlan
        E->>CAP: invoke(capability, field_context)
        CAP-->>E: candidate(s) + evidence
    end
    E-->>A: CandidateResult[] (per field)

    A->>R: reconcile(candidates, contract)
    R-->>A: ResolvedResult[] (per field, with confidence)

    A->>AR: build(resolved, plan, evidence)
    AR-->>A: ExecutionArtifact (with replay_hash)

    A-->>C: ExecutionArtifact

Step-by-step:

1. Receive input_data and the target contract, along with budget and policy.
2. Canonicalize the contract via the contract/ adapter subsystem.
3. Validate the contract via the contract/ validator.
4. Analyze the input profile (lightweight classification; no capability invocation).
5. Plan a deterministic, field-centric execution plan.
6. Execute capabilities in plan order, collecting evidence and diagnostics.
7. Reconcile candidate truths: merge, detect conflicts, assign confidence, resolve.
8. Build the artifact: normalized data + evidence + diagnostics + replay data.
9. Return the artifact. Paxman does no further work.

3. Core Architectural Principles

3.1 Contract externalization

Paxman remains agnostic about where the contract came from. It only needs a canonical internal representation. Every adapter translates into and out of the same CanonicalContract.

3.2 Deterministic planning

Given the same input profile, canonical contract, configuration, and capability set, the planner must choose the same plan. The planner is a pure function over (canonical contract, input profile, configuration, capability registry): it makes no LLM calls, uses no randomness, and reads no clock.

3.3 Isolation of concerns

Contract defines what output must look like.
Planner defines what to do.
Executor defines how to run it.
Reconciler defines what is ultimately true.
Artifact defines what was produced.
API defines what the user can ask for.

Subsystem boundaries are enforced by lint rules and review. See PACKAGE_STRUCTURE.md §8 System Boundary Rules for the formal rules and how to enforce them.

3.4 Field-centric, not document-centric

Paxman plans resolution independently for each required field. There is no "parse the document" mega-step. This enables cost optimization, targeted inference, and selective capability invocation. See ADR-0001.

3.5 Capability confidence rule

Capabilities never assign confidence. They return candidates + evidence + diagnostics. Confidence is assigned only by the Reconciler; the Planner emits a target_confidence (read from the field's confidence_threshold) but never scores a candidate. This prevents confidence inflation. See ADR-0005.

3.6 The pipeline is synthesized, not fixed

Paxman does not have a single canonical pipeline. It synthesizes a plan per (contract, input) pair. Different inputs may use different capabilities for the same field. See ADR-0002.

4. Subsystem Specification

4.1 `contract/`: Translation + Validation Boundary

The only layer that knows about external contract formats. Adapters produce a CanonicalContract; the validator rejects anything invalid with INVALID_CONTRACT.

Inputs: a caller-supplied contract (Pydantic model, JSON Schema, Dict DSL, or OpenAPI). Outputs: a CanonicalContract (validated) or an INVALID_CONTRACT error with structured details.

Responsibilities:

Structural conversion to the canonical form.
Semantic annotation (e.g., "this string is an ISO-4217 currency code").
Field normalization (paths, names, constraints).
Constraint translation.
Validation: types, constraints, paths, semantic tags, confidence thresholds.

Supported V1 types: STRING, INTEGER, DECIMAL, BOOLEAN, DATE, ENUM, OBJECT, ARRAY, MONEY. (See GLOSSARY.md for definitions.)

Why this is a hard boundary: the planner, executor, reconciler, and artifact subsystems never see Pydantic, JSON Schema, or OpenAPI. This is what makes Paxman contract-format-agnostic. See ADR-0004 for the rationale on first-class types and ADR-0007 for the V1 adapter set.

4.2 `planner/`: Field-Centric Plan Synthesis

Deterministic, rule-based. Reads the canonical contract, analyzes the input profile, produces a FieldPlan per required field.

Inputs: CanonicalContract, InputProfile, Budget, Policy, CapabilityRegistry. Outputs: ExecutionPlan = ordered list of FieldPlans.

Heuristic ordering (highest to lowest preference):

1. Explicit evidence (already present in input)
2. Local deterministic extraction (regex, parser)
3. Structured lookup (deterministic table join)
4. Derived computation (formula over resolved fields)
5. Local inference (small local model)
6. Remote inference (LLM)
7. UNRESOLVED (terminal)

This ordering is a default; a contract can override it for an individual field by setting a ResolutionPolicy on that field. See ADR-0002.
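The default ordering and its interaction with the inference flags can be sketched as follows. This is an illustrative sketch, not the planner's actual code: the `Strategy` enum, its member names, and `candidate_strategies` are hypothetical, and the `Policy` class here carries only the two flags relevant to the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Strategy(IntEnum):
    """Resolution strategies, numbered in the default preference order."""

    EXPLICIT_EVIDENCE = 1
    LOCAL_DETERMINISTIC_EXTRACTION = 2
    STRUCTURED_LOOKUP = 3
    DERIVED_COMPUTATION = 4
    LOCAL_INFERENCE = 5
    REMOTE_INFERENCE = 6
    UNRESOLVED = 7  # terminal


@dataclass(frozen=True)
class Policy:
    # Hypothetical stand-in: only the two flags this example needs.
    allow_remote_inference: bool = True
    allow_local_inference: bool = True


def candidate_strategies(policy: Policy) -> list[Strategy]:
    """Return the permitted strategies in default preference order."""
    excluded = set()
    if not policy.allow_local_inference:
        excluded.add(Strategy.LOCAL_INFERENCE)
    if not policy.allow_remote_inference:
        excluded.add(Strategy.REMOTE_INFERENCE)
    # Sorting IntEnum members is a pure function of the inputs:
    # no clock, no randomness, no model call.
    return [s for s in sorted(Strategy) if s not in excluded]
```

Because the result depends only on the policy, calling the function twice with equal policies yields equal lists, which is the property the planner as a whole is required to have.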

FieldPlan shape (informal; see PACKAGE_STRUCTURE.md §planner for module-level details):

FieldPlan(
    field_id: str,
    capability_chain: list[CapabilityInvocation],
    early_stop_threshold: float,  # confidence target
    fallback_policy: ResolutionPolicy,
)

4.3 `capabilities/`: Atomic Operations

Reusable, versioned, metadata-declared operations. The V1 surface is deliberately small.

V1 capabilities:

Capability ID	Purpose	Deterministic?
`text_extraction`	Pull plain text from raw input (PDF, image, HTML)	No (provider-dependent)
`regex_extraction`	Pattern-based local extraction	Yes
`lookup`	Structured / retrieval-based extraction	Yes (deterministic backend) or No (vector backend)
`inference`	Model-backed extraction. LLMs are providers, not the capability.	No (model-dependent)
`validation`	Verify a candidate value against a constraint	Yes

Capability contract (informal; see PACKAGE_STRUCTURE.md §capabilities):

CapabilitySpec(
    id: str,
    version: str,
    input_type: type,
    output_type: type,
    cost_estimate: CostHint,        # tokens, ms, $
    deterministic: bool,
    requires: list[str],            # required provider classes
)

CapabilityResult(
    candidates: list[Candidate],
    evidence: list[EvidenceRef],
    diagnostics: list[Diagnostic],
)

Critical rule: capabilities never assign confidence. See ADR-0005.
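To make the rule concrete, here is a minimal sketch of what a `regex_extraction`-style capability might look like. The class and field layouts are assumptions made for illustration (in particular, `invoke` takes a plain `text` string as a stand-in for the richer field context, and diagnostics are plain strings); only the shape "candidates + evidence + diagnostics, no confidence" comes from this document.

```python
import re
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class Candidate:
    value: str


@dataclass(frozen=True)
class EvidenceRef:
    capability_id: str
    span: tuple[int, int]  # character offsets into the input text


@dataclass(frozen=True)
class CapabilityResult:
    candidates: list[Candidate] = field(default_factory=list)
    evidence: list[EvidenceRef] = field(default_factory=list)
    diagnostics: list[str] = field(default_factory=list)
    # Deliberately no confidence field: scoring belongs to the Reconciler.


class Capability(Protocol):
    def invoke(self, text: str) -> CapabilityResult: ...


class RegexExtraction:
    """Hypothetical pattern-based capability for one fixed pattern."""

    def __init__(self, pattern: str) -> None:
        self._pattern = re.compile(pattern)

    def invoke(self, text: str) -> CapabilityResult:
        matches = list(self._pattern.finditer(text))
        if not matches:
            return CapabilityResult(diagnostics=["no match"])
        return CapabilityResult(
            candidates=[Candidate(m.group(0)) for m in matches],
            evidence=[EvidenceRef("regex_extraction", m.span()) for m in matches],
        )
```

Each candidate is paired with an evidence record pointing at the matched span, so a later stage can judge the evidence without re-running the capability.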

4.4 `executor/`: Deterministic Runner

Runs the plan exactly as the Planner defined it. No replanning, no rerouting, no structural retries. Stops early when a field hits its confidence target.

Inputs: ExecutionPlan, CanonicalContract, CapabilityRegistry, InputData. Outputs: CandidateResult[] — one per FieldPlan, with raw candidates and evidence.

Responsibilities:

Walk FieldPlans in order.
Invoke each capability in the field's capability_chain.
Pass context forward (e.g., "supplier_name was already resolved").
Stop early when confidence target is reached.
Return UNRESOLVED candidates when the chain is exhausted without meeting the threshold.

The Executor never assigns final confidence; it only collects candidate evidence.
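The responsibilities above can be sketched as a simple sequential loop. This is a sketch under stated assumptions rather than the real Executor: capabilities are modeled as plain callables, candidates as strings, and the early-stop check is passed in as a hypothetical `is_satisfied` callable, because this document does not specify how the Executor learns that the target was met.

```python
from dataclasses import dataclass
from typing import Callable

# Simplification: a capability maps the field context to raw candidate values.
CapabilityFn = Callable[[dict], list[str]]


@dataclass(frozen=True)
class FieldPlan:
    field_id: str
    capability_chain: tuple[CapabilityFn, ...]


@dataclass(frozen=True)
class CandidateResult:
    field_id: str
    candidates: tuple[str, ...]
    unresolved: bool


def run_field(plan: FieldPlan, context: dict,
              is_satisfied: Callable[[list[str]], bool]) -> CandidateResult:
    """Walk the chain in plan order and stop at the first satisfied step."""
    collected: list[str] = []
    for capability in plan.capability_chain:
        collected.extend(capability(context))
        if is_satisfied(collected):
            return CandidateResult(plan.field_id, tuple(collected), unresolved=False)
    # Chain exhausted without meeting the target.
    return CandidateResult(plan.field_id, tuple(collected), unresolved=True)


def run(plans: list[FieldPlan],
        is_satisfied: Callable[[list[str]], bool]) -> list[CandidateResult]:
    """Run field plans strictly sequentially, in the order the plan lists them."""
    context: dict = {}
    results = []
    for plan in plans:
        result = run_field(plan, context, is_satisfied)
        if not result.unresolved:
            context[plan.field_id] = result.candidates  # pass context forward
        results.append(result)
    return results
```

Note that the loop never reorders or retries steps: when a chain runs out, the field is reported as unresolved and the next field plan starts.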

4.5 `reconciler/`: Truth Resolution

First-class subsystem. The only place that assigns final confidence and final truth.

Inputs: CandidateResult[], CanonicalContract. Outputs: ResolvedResult[] — one per field, with final value, final confidence, and evidence_refs[].

Three truth layers:

Contract Truth (what the caller requires)
        ↓
Candidate Truth (what capabilities discovered)
        ↓
Resolved Truth (what the Reconciler accepts into the artifact)

Responsibilities:

Merge candidate values (union, intersection, prefer-by-evidence).
Detect conflicts between candidates.
Compare evidence quality.
Assign final confidence (float 0.0–1.0) and confidence band.
Resolve final truth.
Decide when a field is UNRESOLVED vs PARTIAL_SUCCESS.

The Reconciler never executes capabilities and never reads raw input.
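As an illustration of where confidence gets assigned, the sketch below resolves one field from raw candidate values. The scoring rule (share of candidates that agree on the winning value) is invented for this example and is not Paxman's actual confidence model; `reconcile_field` and the `ResolvedResult` layout are likewise hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResolvedResult:
    field_id: str
    value: Optional[str]
    confidence: float
    status: str  # "SUCCESS" or "UNRESOLVED"
    conflict: bool


def reconcile_field(field_id: str, candidates: list[str],
                    confidence_threshold: float) -> ResolvedResult:
    """Resolve one field; this is the only place a confidence is computed."""
    if not candidates:
        return ResolvedResult(field_id, None, 0.0, "UNRESOLVED", conflict=False)
    counts = Counter(candidates)
    # Sort by (-count, value) so that ties break deterministically.
    value, support = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
    confidence = support / len(candidates)  # illustrative agreement ratio
    conflict = len(counts) > 1
    if confidence < confidence_threshold:
        return ResolvedResult(field_id, None, confidence, "UNRESOLVED", conflict)
    return ResolvedResult(field_id, value, confidence, "SUCCESS", conflict)
```

The function takes only candidates and the contract's threshold, which mirrors the constraint that the Reconciler never executes capabilities and never reads raw input.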

4.6 `artifact/`: The Product

The final output bundle. The only replay source.

Inputs: ResolvedResult[], ExecutionPlan, evidence, diagnostics, statistics, configuration. Outputs: ExecutionArtifact (JSON-serializable).

Contains:

normalized_data — the resolved output matching the contract shape.
field_results — FieldResult[] with status, value, confidence, evidence_refs.
unresolved_fields — explicit list of fields the engine could not resolve.
evidence — provenance records (capability, source, span, model id, etc.).
diagnostics — structured warnings and notes.
execution_plan — the FieldPlan[] that was executed.
replay_hash — deterministic signature over contract + plan + capability versions + configuration.
statistics — token counts, capability invocations, latency, cost.

Statuses: SUCCESS, PARTIAL_SUCCESS, UNRESOLVED, INVALID_CONTRACT, EXECUTION_FAILED. See GLOSSARY.md for definitions.
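One common way to obtain a deterministic signature like replay_hash is to hash a canonical JSON serialization. The sketch below assumes that approach; the function signature, the use of plain dicts and lists, and the choice of SHA-256 are assumptions for illustration, not the documented algorithm.

```python
import hashlib
import json


def replay_hash(contract: dict, plan: list, capability_versions: dict,
                configuration: dict) -> str:
    """Hash the replay-relevant inputs independently of dict key order."""
    payload = {
        "contract": contract,
        "plan": plan,
        "capability_versions": capability_versions,
        "configuration": configuration,
    }
    # sort_keys plus fixed separators yields a single canonical byte string
    # for logically equal inputs.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Values such as Decimal amounts would need to be converted to strings before hashing, since `json.dumps` does not serialize them natively.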

4.7 `api/`: Public Surface

The only thing users see. Tiny, stable, versioned.

import paxman

result = paxman.normalize(
    input_data=...,
    contract=...,
    budget=...,
    policy=...,
)

rehydrated = paxman.replay(artifact, contract=...)

Stability rules:

Public API is whatever is re-exported from paxman/__init__.py.
Subsystem names, FieldPlan, CapabilitySpec, TruthLayer do not leak.
CI enforces the public surface: a test_public_api.py fails if anything new is added without an ADR.

5. Internal Module Layout

paxman_core/
├── contract/
│   ├── canonical.py
│   ├── validator.py
│   ├── semantics.py
│   └── adapters/
│       ├── pydantic.py
│       ├── json_schema.py
│       ├── dict_dsl.py
│       └── openapi.py
│
├── planner/
│   ├── planner.py
│   ├── heuristics.py
│   ├── scoring.py
│   ├── policies.py
│   └── field_plan.py
│
├── capabilities/
│
├── executor/
│
├── reconciler/
│
├── artifact/
│
└── api/

For module-level details, see PACKAGE_STRUCTURE.md.

6. Error Model

6.1 Status codes (artifact-level)

Status	Meaning	Caller action
`SUCCESS`	All required fields resolved with acceptable confidence	Consume `normalized_data`
`PARTIAL_SUCCESS`	Some required fields resolved; some `UNRESOLVED`	Inspect `unresolved_fields`; decide whether to retry or accept
`UNRESOLVED`	No required field reached the confidence threshold	Reject or retry with a stronger budget
`INVALID_CONTRACT`	The contract failed validation	Fix the contract; do not retry
`EXECUTION_FAILED`	An unrecoverable error occurred during execution	Inspect `diagnostics`; do not blindly retry

6.2 Error hierarchy (Python exceptions)

PaxmanError (base)
├── InvalidContractError
│     ├── UnsupportedFieldTypeError
│     ├── InvalidConstraintError
│     ├── InvalidPathError
│     └── InvalidSemanticTagError
├── ExecutionError
│     ├── CapabilityError
│     │     └── InferenceProviderError
│     ├── BudgetExceededError
│     └── ReconciliationError
├── ReplayError
│     ├── VersionMismatchError
│     └── HashMismatchError
└── ConfigurationError
      ├── InvalidBudgetError
      └── InvalidPolicyError

Each exception carries an error_code (string from §6.1) and a structured context dict for logging and tracing. This mirrors the Pydantic / instructor pattern of a type code + loc + msg + ctx.
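A minimal sketch of how part of this hierarchy could carry error_code and context is shown below. Only two branches are included; the constructor signature is an assumption, and codes for the ReplayError and ConfigurationError branches are omitted because §6.1 does not list any for them.

```python
from typing import Any, Optional


class PaxmanError(Exception):
    """Base class; concrete branches set error_code as a class attribute."""

    error_code: str

    def __init__(self, message: str,
                 context: Optional[dict[str, Any]] = None) -> None:
        super().__init__(message)
        # Copy so later mutation by the caller cannot change what was raised.
        self.context = dict(context or {})


class InvalidContractError(PaxmanError):
    error_code = "INVALID_CONTRACT"


class UnsupportedFieldTypeError(InvalidContractError):
    pass  # inherits INVALID_CONTRACT


class ExecutionError(PaxmanError):
    error_code = "EXECUTION_FAILED"


class CapabilityError(ExecutionError):
    pass  # inherits EXECUTION_FAILED
```

Callers can then branch on `exc.error_code` rather than on message text, and log `exc.context` as structured data.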

6.3 Status vs exception: when to use which

Exception — the failure is unrecoverable and the caller must handle it. Examples: INVALID_CONTRACT, EXECUTION_FAILED for capability crashes.
Status — the failure is "expected" (e.g., a field cannot be resolved) and is encoded into the artifact. Examples: UNRESOLVED, PARTIAL_SUCCESS.

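A BudgetExceededError signals that a caller-supplied budget limit was exceeded during the run. Under the V1 default (§7.1) the run short-circuits, the error is recorded in diagnostics, and the artifact is returned with status PARTIAL_SUCCESS; because the behavior on budget exceeded is configurable, the error reaches the caller as a raised exception only when that behavior is configured accordingly. An UNRESOLVED field is a status because it is the engine's honest report of what it could and could not do.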

7. Configuration Model

Paxman takes three explicit configuration objects at the call site:

result = paxman.normalize(
    input_data=...,
    contract=...,
    budget=Budget(
        max_total_cost_usd=Decimal("0.10"),  # Decimal per ADR-0004 / ADR-0010
        max_total_latency_ms=5_000,
        max_remote_inference_calls=2,
    ),
    policy=Policy(
        allow_remote_inference=True,
        allow_local_inference=True,
        confidence_floor=0.80,
        unresolved_acceptable=False,
    ),
)

7.1 `Budget` (hard limits)

Field	Type	Meaning
`max_total_cost_usd`	`Decimal | None`	Hard cap on cost in USD; aborts the run when exceeded. Constructor accepts `float | int | Decimal` and coerces to `Decimal` (MONEY is Decimal per ADR-0004 / ADR-0010).
`max_total_latency_ms`	`int | None`	Hard cap on wall-clock latency.
`max_remote_inference_calls`	`int | None`	Cap on remote inference invocations.
`max_capability_invocations`	`int | None`	Cap on total capability invocations.

When any budget is exceeded, the artifact is returned with status PARTIAL_SUCCESS and a BudgetExceededError is logged in diagnostics. (V1: the behavior on budget exceeded is configurable; default is to short-circuit and return what was resolved.)
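The coercion described for `max_total_cost_usd` could look like the sketch below. The field names follow the table above; the dataclass layout, the `cost_exceeded` helper, and the choice to convert through `str()` are assumptions for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Budget:
    max_total_cost_usd: Optional[Decimal] = None
    max_total_latency_ms: Optional[int] = None
    max_remote_inference_calls: Optional[int] = None
    max_capability_invocations: Optional[int] = None

    def __post_init__(self) -> None:
        cost = self.max_total_cost_usd
        if cost is not None and not isinstance(cost, Decimal):
            # Convert through str() so 0.1 becomes Decimal("0.1") rather than
            # the long binary expansion that Decimal(0.1) would produce.
            object.__setattr__(self, "max_total_cost_usd", Decimal(str(cost)))

    def cost_exceeded(self, spent_usd: Decimal) -> bool:
        """True once spending has passed the cap; None means no cap."""
        cap = self.max_total_cost_usd
        return cap is not None and spent_usd > cap
```

A `None` limit is treated as "no cap", matching the `| None` types in the table.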

7.2 `Policy` (soft preferences)

Field	Type	Meaning
`allow_remote_inference`	`bool`	If `False`, the planner excludes step 6 of the heuristic.
`allow_local_inference`	`bool`	If `False`, the planner excludes step 5.
`confidence_floor`	`float`	Minimum confidence to mark a field `SUCCESS`; below this is `PARTIAL_SUCCESS`.
`unresolved_acceptable`	`bool`	If `False`, the artifact status is `UNRESOLVED` when any required field is unresolved.
`currency_policy`	`CurrencyPolicy | None`	For `MONEY` fields: behavior on cross-currency arithmetic.

7.3 `ContractPolicy` (per-contract)

Set on the CanonicalContract itself; overrides call-site Policy for that contract. Example use: "this contract's tax_amount field may never be inferred; it must be explicit or UNRESOLVED."

8. Truth Resolution Model

Paxman operates on three explicit truth layers:

Contract Truth (what the caller requires — frozen at validation)
        ↓
Candidate Truth (what capabilities discover — mutable, may be empty)
        ↓
Resolved Truth (what the Reconciler accepts into the artifact — frozen at emission)

Key invariants:

The Reconciler is the only subsystem that mutates Candidate Truth into Resolved Truth.
The Planner reads Contract Truth to plan and the Reconciler reads Contract Truth to validate; the Planner does not write Resolved Truth.
Confidence is assigned only in the Reconciler. The Planner may emit a target_confidence (the field's confidence_threshold) but never assigns confidence to a candidate. See ADR-0005.

9. Versioning Strategy

Paxman has four versioned dimensions:

Dimension	Format	Source of truth	Bump triggers
Library version	`MAJOR.MINOR.PATCH` (semver)	`pyproject.toml`	API breakage, deprecation, new features
Planner version	Embedded in artifact	Internal constant	Algorithm change that affects the plan
Capability version	`<capability_id>@<semver>`	Capability registry	Capability input/output/cost change
Contract schema version	`<adapter>:<version>` (e.g., `pydantic:2`, `json_schema:draft-2020-12`)	Adapter	Source format change

9.1 Library version policy (semver)

MAJOR (1.0 → 2.0) — breaking change to public API, artifact format, or replay semantics.
MINOR (1.0 → 1.1) — new feature, new optional dependency, new capability, backward compatible.
PATCH (1.0.0 → 1.0.1) — bug fix, perf, doc, no public API change.

9.2 What is and is not a breaking change

Following Pydantic's published policy as a model:

NOT a breaking change in MINOR:

Adding new error codes.
Adding new fields to error responses.
Changing error message text (use error_code for programmatic handling).
Adding a new optional configuration parameter.
Adding a new optional capability.
Adding a new optional adapter.

IS a breaking change in MAJOR:

Removing a public API method.
Changing a public method signature.
Removing an error code.
Changing the artifact JSON shape for a field that is already emitted.
Changing replay semantics such that the same artifact + version no longer reproduces.

9.3 Pre-1.0

Before 1.0, MINOR versions may contain breaking changes. The current target is 1.0 when:

All 9 success metrics in PRD §9 are met or explicitly waived.
All 8 V1 acceptance criteria in PRD §10 are met.

9.4 Capability versioning

Capabilities are independently versioned. A capability is referenced in FieldPlan as <id>@<version>. The planner may pin a major version. Replay checks the pinned versions.
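A reference of the form `<id>@<version>` and a major-version pin can be handled with a few lines of parsing. The sketch below is illustrative: `CapabilityRef`, `parse_ref`, and `satisfies_major_pin` are hypothetical names, and accepting a two-part version such as `lookup@1.0` (with patch defaulting to 0) is an assumption.

```python
from typing import NamedTuple


class CapabilityRef(NamedTuple):
    capability_id: str
    major: int
    minor: int
    patch: int


def parse_ref(ref: str) -> CapabilityRef:
    """Parse '<id>@MAJOR.MINOR[.PATCH]' into its parts."""
    capability_id, _, version = ref.partition("@")
    parts = version.split(".")
    if (not capability_id or len(parts) not in (2, 3)
            or not all(p.isdigit() for p in parts)):
        raise ValueError(f"malformed capability reference: {ref!r}")
    numbers = [int(p) for p in parts] + [0] * (3 - len(parts))
    return CapabilityRef(capability_id, *numbers)


def satisfies_major_pin(pinned: str, registered: str) -> bool:
    """True if `registered` is the same capability and major version as `pinned`."""
    want, have = parse_ref(pinned), parse_ref(registered)
    return want.capability_id == have.capability_id and want.major == have.major
```

A replay check that needs the exact pinned version would compare the full tuples instead of only the major component.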

9.5 Replay compatibility matrix

A new Paxman version can replay an old artifact only if the artifact's planner version and capability versions are supported by the new Paxman. Mismatches raise VersionMismatchError (replay) or HashMismatchError (rehydration). See REPLAY_AND_DETERMINISM.md for the full replay model.

10. Testing Architecture

See TESTING_STRATEGY.md for the full strategy. This section lists the architectural seams.

10.1 Test seams

Subsystem	Seam	How to test in isolation
`contract/`	Adapter + Validator as pure functions	Inject fixture contracts; assert `CanonicalContract`
`planner/`	`plan(canonical, input_profile, budget, policy, registry) → ExecutionPlan`	Inject a fake `CapabilityRegistry`
`capabilities/`	Each capability as a `Protocol`	Mock the provider; assert `CapabilityResult`
`executor/`	`run(plan, contract, registry, input) → CandidateResult[]`	Mock all capabilities; assert invocation order
`reconciler/`	`reconcile(candidates, contract) → ResolvedResult[]`	Feed crafted candidates; assert final truth
`artifact/`	`build(resolved, plan, evidence) → ExecutionArtifact`	Build from fixtures; assert replay_hash determinism

10.2 Determinism tests

For every subsystem, two tests:

Property test — given the same inputs, the same outputs (Hypothesis).
Replay test — given a fixture artifact, rehydrate produces the same JSON hash.

10.3 End-to-end fixtures

A curated set of (input, contract, expected_artifact) fixtures that exercise every capability and adapter. The fixtures are checked into the repo and used in CI.

10.4 Coverage

≥ 90% line coverage on contract/, planner/, executor/, reconciler/.
100% coverage on errors.py and versioning.py.

11. Concurrency Model

11.1 V1

paxman.normalize() is synchronous and not thread-safe within a single process. Callers must serialize calls.
The Executor runs field plans sequentially in a deterministic order. No parallel field execution.
Capabilities may be thread-safe internally (their choice); Paxman does not require it.

11.2 Future (V2)

Parallel field execution is permitted only if the planner proves that the fields are independent (no derived computation dependencies).
A Policy.parallelism knob would let callers opt in.
Async API (async def normalize) is V2.

This decision keeps V1 deterministic and replayable end-to-end without a complex scheduler. See ADR-0006.

12. Observability

12.1 What Paxman emits

Structured events — at the start of planning, per capability invocation, per field resolution, and at artifact emission.
Diagnostics — encoded into the artifact (e.g., "skipped remote inference because policy.allow_remote_inference=False").
Counters — capability invocations, tokens, cost, latency per call.

12.2 What Paxman does NOT emit

Raw input is never written to logs or telemetry by default.
Inference prompts and completions are not emitted by default; they are recorded in evidence only when the caller opts in.

12.3 Determinism-safe logging

Logs are emitted via structlog (or an injected logger) with no timestamps in the replay path.
Clock reads are injected; in tests, a fixed clock is used.
Random number generators are injected; the planner does not use them.

This allows the same execution to produce the same artifact and the same logs.
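Clock injection is a standard pattern and can be sketched in a few lines. The `Clock` protocol, `SystemClock`, `FixedClock`, and `timed_call` below are hypothetical names chosen for this example; the document only states that clock reads are injected and that tests use a fixed clock.

```python
import time
from typing import Callable, Protocol


class Clock(Protocol):
    def now_ms(self) -> int: ...


class SystemClock:
    """Production clock backed by the monotonic system timer."""

    def now_ms(self) -> int:
        return time.monotonic_ns() // 1_000_000


class FixedClock:
    """Test clock: starts at a fixed value and advances only when told to."""

    def __init__(self, start_ms: int = 0) -> None:
        self._now = start_ms

    def now_ms(self) -> int:
        return self._now

    def advance(self, delta_ms: int) -> None:
        self._now += delta_ms


def timed_call(clock: Clock, fn: Callable[[], object]) -> tuple[object, int]:
    """Run fn and return (result, elapsed_ms) as measured by the injected clock."""
    started = clock.now_ms()
    result = fn()
    return result, clock.now_ms() - started
```

With `FixedClock`, a latency measurement in a test is exactly what the test advanced the clock by, so recorded statistics are reproducible from run to run.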

12.4 Metrics (non-replay path)

For non-replay consumers (production monitoring), Paxman emits:

paxman_normalize_total{status=...} — counter
paxman_normalize_duration_seconds — histogram
paxman_capability_invocations_total{capability=...,version=...} — counter
paxman_replay_total{status=...} — counter

Metric emission is opt-in via Policy.emit_metrics: bool (default False).

13. Security and PII Model

See SECURITY.md for the full threat model. This section is the architectural summary.

13.1 Data handling defaults

Data type	Default behavior	Override
Raw input	Held in memory only; never written to logs or artifacts	`Policy.log_raw_input: bool` (default `False`)
Inference prompts	Held in memory only; not in artifacts	`Policy.record_inference_io: bool` (default `False`)
Inference completions	Same as prompts	Same
PII	Caller's responsibility to redact or sanitize	No Paxman auto-redaction in V1
Provider secrets	Passed by reference (env var name or secret store handle)	Never embedded in artifacts
Evidence	Stored as references (span offsets, capability id, source identifier)	`Policy.embed_evidence_payload: bool` (default `False`)

13.2 Prompt-injection posture

The Inference capability accepts arbitrary completion text. The Reconciler treats inference output as untrusted until validated by the Validation capability or a downstream ValidationPolicy. No inference output is ever trusted as a final value without validation. This is enforced at the Reconciler, not the capability.

13.3 Multi-tenant posture

Paxman is a library. It does not enforce tenant isolation. The caller (e.g., a SaaS wrapper) is responsible for routing inputs and artifacts to the right tenant store.

14. Performance and SLOs

Paxman does not commit to formal SLOs in V1. The following are aspirational targets for measurement only:

Operation	Target p50	Target p99	Notes
`paxman.normalize()` — 20-field contract, 100 KB input, no remote inference	≤ 200 ms	≤ 2 s	Cold process
`paxman.replay()` — 100 KB artifact	≤ 50 ms	≤ 500 ms	Rehydration only
Cold import + capability registration	≤ 100 ms	≤ 500 ms	At interpreter start
`CanonicalContract` construction (Pydantic adapter)	≤ 50 ms	≤ 500 ms	For a 50-field model

14.1 Profiling hooks

V1 ships:

A Profile event hook that records per-capability latency.
A BudgetTracker that records per-capability cost.
A Clock injection point for deterministic tests.

14.2 What is NOT a perf guarantee

Inference latency (provider-dependent).
Network round-trips.
Adapter time for very large contracts (>10,000 fields).

15. Extension Model

See EXTENDING.md for the step-by-step guide. This section is the architectural summary.

15.1 Adding a new contract adapter

Implement ContractAdapter protocol: def adapt(self, external_contract) -> CanonicalContract and def export(self, canonical) -> ExternalFormat.
Register with CapabilityRegistry (or ContractAdapterRegistry).
Add an entry for the adapter to the V1 contract adapter set.
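The adapter protocol can be sketched as below. The `adapt` and `export` method names come from the step list above; everything else is hypothetical, including the flat `{path: TYPE}` input format (which is not the actual Dict DSL), the `CanonicalField` layout, and the use of `ValueError` where real code would presumably raise UnsupportedFieldTypeError.

```python
from dataclasses import dataclass
from typing import Any, Protocol

# The V1 type names listed in §4.1.
V1_TYPES = frozenset({"STRING", "INTEGER", "DECIMAL", "BOOLEAN", "DATE",
                      "ENUM", "OBJECT", "ARRAY", "MONEY"})


@dataclass(frozen=True)
class CanonicalField:
    path: str
    type: str


@dataclass(frozen=True)
class CanonicalContract:
    fields: tuple[CanonicalField, ...]


class ContractAdapter(Protocol):
    def adapt(self, external_contract: Any) -> CanonicalContract: ...
    def export(self, canonical: CanonicalContract) -> Any: ...


class FlatDictAdapter:
    """Hypothetical adapter for a flat {path: TYPE} mapping."""

    def adapt(self, external_contract: dict[str, str]) -> CanonicalContract:
        fields = []
        for path in sorted(external_contract):  # sorted for a stable field order
            type_name = external_contract[path].upper()
            if type_name not in V1_TYPES:
                raise ValueError(f"unsupported field type {type_name!r} at {path!r}")
            fields.append(CanonicalField(path, type_name))
        return CanonicalContract(tuple(fields))

    def export(self, canonical: CanonicalContract) -> dict[str, str]:
        return {f.path: f.type for f in canonical.fields}
```

Sorting the paths during `adapt` keeps the canonical form independent of the caller's dict ordering, which helps downstream determinism.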

15.2 Adding a new capability

Implement Capability protocol: def invoke(self, ctx) -> CapabilityResult and def spec(self) -> CapabilitySpec.
Register with the global CapabilityRegistry.
Add to the planner's heuristic ordering (or define a new strategy) via ADR.

15.3 Adding a new inference provider

Implement InferenceProvider protocol: def complete(self, prompt, model) -> Completion.
Register as the provider for the inference capability.
The provider is not a capability — capabilities remain provider-agnostic.

15.4 Adding a new policy or budget field

This is a public API change and requires a new ADR and a MINOR (or MAJOR, if breaking) version bump.

16. Migration and Upgrade Story

16.1 Replay across Paxman versions

An artifact recorded under Paxman 1.0 is replayable on Paxman 1.1.x (within the same major).
A Paxman 1.x artifact replayed on Paxman 2.0 raises VersionMismatchError with a clear message; the caller must regenerate the artifact.
The replay-hash includes the planner version; the executor checks the planner version before rehydrating.
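The same-major rule can be expressed as a small guard, sketched below. It covers only the library-version rule from the bullets above (planner and capability version checks are left out), the `check_replayable` name is hypothetical, and the exception class is a local stand-in for the real VersionMismatchError.

```python
class VersionMismatchError(Exception):
    """Local stand-in for the ReplayError subclass of the same name."""


def check_replayable(artifact_version: str, runtime_version: str) -> None:
    """Raise unless the artifact was recorded under the runtime's major version."""
    artifact_major = int(artifact_version.split(".")[0])
    runtime_major = int(runtime_version.split(".")[0])
    if artifact_major != runtime_major:
        raise VersionMismatchError(
            f"artifact recorded under {artifact_version} cannot be replayed "
            f"on {runtime_version}; regenerate the artifact"
        )
```

This sketch treats any two versions in the same major as compatible in both directions; whether a newer artifact may be replayed on an older minor is not stated here and would need to be decided separately.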

16.2 Capability upgrades

A capability at <id>@1.0 is replaced by <id>@1.1 only on explicit registration. Old plans still pin the old version.
Old plans replay using the old capability version; if the old version is no longer registered, replay raises CapabilityNotFoundError.

16.3 Contract adapter upgrades

A contract adapter can be upgraded independently. Adapters are responsible for forward-compatibility within a major.

16.4 Deprecation policy

Public API: deprecate in MINOR, remove in next MAJOR.
Internal API: no deprecation cycle; rename freely between MAJORs.
ADRs document every deprecation and removal.

17. V1 Scope

17.1 What to ship in V1

Contract Adapters: Pydantic, JSON Schema, Dict DSL (required); OpenAPI (optional, best-effort).
Contract Validator.
Canonical Contract Model.
Field-Centric Planner (rule-based).
Executor (sequential).
Reconciler.
Artifact Builder.
5 Capabilities: text_extraction, regex_extraction, lookup, inference, validation.
Replay hash + replay rehydration.
1 reference inference provider (stub or local).

17.2 What to postpone to V2

Capability marketplace.
Visual planners.
Graph execution.
LLM planners.
Workflow orchestration.
Persistent execution.
RAG subsystems.
Multi-agent coordination.
Parallel field execution.
Async API.
Distributed tracing export.

18. Data Flow Diagram

       Caller
         │
         │  input_data + contract + budget + policy
         ▼
   ┌───────────────┐
   │ api.normalize │
   └──────┬────────┘
          │
          ▼
   ┌──────────────────┐
   │ contract.adapt   │ ──▶ INVALID_CONTRACT (exception)
   │ contract.validate│
   └──────┬───────────┘
          │  CanonicalContract
          ▼
   ┌──────────────────┐
   │ input.profile    │ (lightweight classification)
   └──────┬───────────┘
          │  InputProfile
          ▼
   ┌──────────────────┐
   │ planner.plan     │
   └──────┬───────────┘
          │  ExecutionPlan (FieldPlan[])
          ▼
   ┌──────────────────┐
   │ executor.run     │
   │  ├── capability₁ │
   │  ├── capability₂ │ ──▶ candidates + evidence
   │  └── ...         │
   └──────┬───────────┘
          │  CandidateResult[]
          ▼
   ┌──────────────────┐
   │ reconciler.recon │
   │  merge, conflict,│ ──▶ resolved truth + confidence
   │  confidence      │
   └──────┬───────────┘
          │  ResolvedResult[]
          ▼
   ┌──────────────────┐
   │ artifact.build   │
   │  serialize, hash │
   └──────┬───────────┘
          │  ExecutionArtifact (with replay_hash)
          ▼
        Caller

19. Open Architectural Questions

These are tracked in PRD §13 as open questions for the V1 cycle. Architectural-level questions:

Q-A1 Should the Planner emit a target_confidence for each FieldPlan (yes, per §4.2) or only on the contract field? (Resolved: per-field on the contract; the planner reads it.)
Q-A2 Is a ResolutionPolicy per field or per contract? (Resolved: per field; the contract carries a default.)
Q-A3 Should MONEY arithmetic be a capability or a Reconciler primitive? (Open — likely a Reconciler primitive; see ADR-0004.)
Q-A4 How are conflicting UNRESOLVED reasons represented? (Open — design candidate: list of (capability_id, reason_code, message).)
Q-A5 Does the Executor support capability-level cancellation? (Open — V1: no; cancel = abort the run.)

20. References

PACKAGE_STRUCTURE.md — Module layout, dependency DAG, public/private split.
GLOSSARY.md — Full domain vocabulary.
REPLAY_AND_DETERMINISM.md — Replay model deep dive.
SECURITY.md — Threat model and PII handling.
TESTING_STRATEGY.md — Test seams and determinism tests.
DEVELOPMENT.md — Local dev setup.
EXTENDING.md — How to add a capability, adapter, or provider.
DEPENDENCIES.md — Core vs optional dependencies.
docs/adr/ — Architecture Decision Records.
