How to Add a New Capability
Status: V1
Audience: Paxman users who want to add a new atomic extraction/validation operation (OCR, barcode decoding, custom regex library, …).
Related docs: EXTENDING.md §2 (the full SPI walkthrough), docs/concepts/capabilities.md (what a capability is), ADR-0005 (why capabilities don't assign confidence).
This guide is a focused quick-start for adding a new capability to Paxman. The full SPI walkthrough is in EXTENDING.md §2; this document is a 5-minute checklist.
1. When to add a new capability
Add a new capability when:
- You have a domain-specific extraction or validation step that doesn't fit the V1 surface.
- You want to ship a new algorithm (e.g. a custom regex library, a custom lookup table, a custom model wrapper).
- You are shipping a new field type that the V1 capabilities cannot handle.
Do not add a new capability when:
- The operation can be expressed as a V1 capability (e.g. don't add phone_extraction; use regex_extraction with a phone regex).
- The operation is a "policy" rather than an operation (use ResolutionPolicy or ContractPolicy instead).
- The operation needs to read the contract or the raw input directly (it must go through CapabilityContext).
2. The Capability SPI
from typing import Protocol

from paxman.protocols import CapabilitySpec, CapabilityContext, CapabilityResult


class Capability(Protocol):
    """SPI: an atomic operation."""

    @property
    def spec(self) -> CapabilitySpec:
        """Metadata describing the capability."""
        ...

    def invoke(self, ctx: CapabilityContext) -> CapabilityResult:
        """Run the capability on the given context.

        Returns:
            CapabilityResult with candidates, evidence_refs, and diagnostics.
            MUST NOT include a `confidence` field.

        Raises:
            CapabilityError: if the capability fails to run.
        """
        ...
CapabilityResult is frozen, slotted; it has candidates,
evidence_refs, and diagnostics — and no confidence field
(per ADR-0005).
3. Step-by-step
3.1 Pick a capability id and version
Choose a stable id (e.g. "ocr", "barcode", "date_parser").
Choose a semver version (e.g. "1.0", "1.0.1"). The registry
is keyed on (id, version); the planner picks the highest
registered version.
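The "highest registered version" rule can be sketched as follows. This is an illustration only: the guide does not say how the planner compares versions, so the numeric, component-wise comparison below is an assumption, and version_key and pick_highest are hypothetical names. The point is that comparing version strings as text would rank "1.9" above "1.10".

```python
def version_key(version: str) -> tuple[int, ...]:
    """Turn "1.10" into (1, 10) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def pick_highest(registry: dict[tuple[str, str], object], capability_id: str) -> str:
    """Return the highest registered version string for capability_id."""
    versions = [version for (cid, version) in registry if cid == capability_id]
    if not versions:
        raise KeyError(capability_id)
    return max(versions, key=version_key)


# Hypothetical registry contents, keyed on (id, version).
registry = {
    ("date_parser", "1.0"): "impl-a",
    ("date_parser", "1.0.1"): "impl-b",
    ("date_parser", "1.9"): "impl-c",
    ("date_parser", "1.10"): "impl-d",
    ("ocr", "2.0"): "impl-e",
}

highest = pick_highest(registry, "date_parser")  # "1.10"
```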
3.2 Declare the CapabilitySpec
The spec tells the planner what the capability does, its input and output FieldType, its tier, and its cost:
import attrs
from decimal import Decimal

from paxman.capabilities.base import Capability, CapabilityContext
from paxman.capabilities.result import (
    CapabilityResult,
    Candidate,
    Diagnostic,
    DiagnosticCode,
    DiagnosticSeverity,
    EvidenceRef,
)
from paxman.capabilities.spec import CapabilitySpec, CapabilityTier, CostHint
from paxman.types import FieldType

# parse_date and ParseError are not defined in this guide; substitute your own
# parsing helper and the exception it raises.


@attrs.frozen(slots=True)
class DateParserCapability:
    """Parse a date string from a CapabilityContext."""

    @property
    def spec(self) -> CapabilitySpec:
        return CapabilitySpec(
            id="date_parser",
            version="1.0",
            input_type=FieldType.STRING,
            output_type=FieldType.DATE,
            tier=CapabilityTier.LOCAL_DETERMINISTIC,
            cost_estimate=CostHint(
                usd=Decimal("0.0"),
                ms=10,
                invocations=1,
                tokens=0,
            ),
            deterministic=True,
            required_providers=(),
        )

    def invoke(self, ctx: CapabilityContext) -> CapabilityResult:
        try:
            date_value = parse_date(ctx.raw_input)
        except ParseError as e:
            # Recoverable failure: return no candidates plus a diagnostic.
            return CapabilityResult(
                candidates=(),
                evidence_refs=(),
                diagnostics=(Diagnostic(
                    code=DiagnosticCode.PATTERN_NO_MATCH,
                    severity=DiagnosticSeverity.WARNING,
                    message=str(e),
                    context={"input": ctx.raw_input},
                ),),
            )
        # Success: one candidate plus evidence of where it came from.
        return CapabilityResult(
            candidates=(Candidate(value=date_value, evidence_refs=(), diagnostics=()),),
            evidence_refs=(
                EvidenceRef(
                    capability_id="date_parser",
                    capability_version="1.0",
                    span=ctx.span,
                    field_path=ctx.field_path,
                ),
            ),
            diagnostics=(),
        )
The CostHint is a deterministic upper bound for the
capability's cost. The Planner uses it to score the capability and
to pre-flight budget gates.
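Because each CostHint is an upper bound, a pre-flight budget gate can, in principle, sum the bounds of every capability in a plan and compare the totals against the budget before anything runs. The sketch below illustrates that idea only. SketchCost and fits_budget are hypothetical names, and this is not the scoring formula from docs/specs/capability-cost-model.md.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class SketchCost:
    """Hypothetical stand-in mirroring the CostHint fields shown above."""

    usd: Decimal
    ms: int
    invocations: int = 1
    tokens: int = 0


def fits_budget(plan: list[SketchCost], max_usd: Decimal, max_ms: int) -> bool:
    """Sum the per-capability upper bounds and compare them to the budget."""
    total_usd = sum((cost.usd for cost in plan), Decimal("0"))
    total_ms = sum(cost.ms for cost in plan)
    return total_usd <= max_usd and total_ms <= max_ms


# A local, free capability followed by a paid remote one (made-up numbers).
plan = [
    SketchCost(usd=Decimal("0.0"), ms=10),
    SketchCost(usd=Decimal("0.02"), ms=800, tokens=1500),
]
```

Using Decimal rather than float for the dollar amounts keeps the sums exact, which matters when a gate compares totals against a hard limit.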
3.3 Be stateless
The capability must be stateless across invocations. Each call
must produce the same result for the same input (unless
spec.deterministic=False).
Tests:
def test_date_parser_capability_is_stateless():
    cap = DateParserCapability()
    ctx = CapabilityContext(raw_input=b"2026-01-15", ...)
    # Invoking twice with the same context must yield identical candidates.
    a = cap.invoke(ctx)
    b = cap.invoke(ctx)
    assert a.candidates == b.candidates
3.4 Capture external effects in evidence
If the capability calls an external service, record the call (provider, model, prompt hash, completion hash) in evidence:
evidence_refs=(
    EvidenceRef(
        capability_id="inference",
        capability_version="1.0",
        span=ctx.span,
        field_path=ctx.field_path,
        extras={"provider": "openai", "model": "gpt-5", "prompt_hash": "..."},
    ),
),
This is critical for replay: the recorded evidence is what
paxman.replay() rehydrates from. Without it, replay would have to
re-invoke the capability, breaking determinism.
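One way to produce the prompt and completion hashes is to digest the exact text sent and received. The sketch below assumes SHA-256 over UTF-8 bytes; the guide does not specify the hash algorithm, and content_hash and external_call_extras are hypothetical helper names.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return the hex SHA-256 digest of the UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def external_call_extras(
    provider: str, model: str, prompt: str, completion: str
) -> dict[str, str]:
    """Build an extras mapping that records one external call."""
    return {
        "provider": provider,
        "model": model,
        "prompt_hash": content_hash(prompt),
        "completion_hash": content_hash(completion),
    }


# Made-up provider, model, and payloads for illustration.
extras = external_call_extras(
    provider="example-provider",
    model="example-model",
    prompt="Extract the issue date from: Invoice dated 2026-01-15",
    completion="2026-01-15",
)
```

Hashing rather than storing the raw prompt keeps the evidence record small while still letting a replay check that a recorded call matches the one it expects.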
3.5 Register the capability
Registering with an already-registered (id, version) raises
InvalidContractError.
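The duplicate-registration rule can be pictured with a small stand-in registry. This is not Paxman's registry API: SketchRegistry and its register method are hypothetical, and InvalidContractError is redefined locally so the sketch runs on its own.

```python
class InvalidContractError(Exception):
    """Local stand-in for the Paxman exception of the same name."""


class SketchRegistry:
    """Hypothetical registry keyed on (id, version)."""

    def __init__(self) -> None:
        self._entries: dict[tuple[str, str], object] = {}

    def register(self, capability_id: str, version: str, capability: object) -> None:
        """Store the capability, refusing to overwrite an existing key."""
        key = (capability_id, version)
        if key in self._entries:
            raise InvalidContractError(f"capability {key!r} is already registered")
        self._entries[key] = capability

    def __len__(self) -> int:
        return len(self._entries)


sketch_registry = SketchRegistry()
sketch_registry.register("date_parser", "1.0", object())
sketch_registry.register("date_parser", "1.0.1", object())  # new version: accepted
```

Registering ("date_parser", "1.0") a second time would raise InvalidContractError, so shipping a changed implementation means bumping the version rather than replacing an entry in place.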
3.6 Write tests
At minimum:
- Happy path: one test per input_type you support.
- Edge cases: empty input, malformed input, very long input.
- Stateless: same input → same output across calls.
- Determinism flag: spec.deterministic=True for any capability backed by a pure function.
- Evidence: result.evidence_refs is non-empty when the capability produces a candidate.
from datetime import date

from paxman.capabilities.base import CapabilityContext
from paxman.types import FieldType


def test_date_parser_capability_handles_iso_format():
    cap = DateParserCapability()
    ctx = CapabilityContext(
        raw_input=b"2026-01-15",
        field_path="issue_date",
        field_type_name=FieldType.DATE.value,
    )
    result = cap.invoke(ctx)
    assert len(result.candidates) == 1
    assert result.candidates[0].value == date(2026, 1, 15)
Use paxman.testing.capability_contexts() for property tests.
3.7 Distribute
If your capability is a new public SPI surface for the Paxman
core, you need an ADR (see docs/adr/README.md).
If you are publishing as a separate PyPI package
(paxman-<your-capability>), you do not need an ADR for the
Paxman core repo, but the extension should document its SPI
compliance.
4. What capabilities MUST do
- Return CapabilityResult with candidates, evidence_refs, and diagnostics.
- Be stateless: no mutable state across invocations.
- Declare a CapabilitySpec with input/output, cost, determinism, and required providers.
- Capture external effects in evidence: if the capability calls an external service, record the call as evidence.
- Fail loudly on unrecoverable errors via CapabilityError.
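The difference between a recoverable failure (return a diagnostic) and an unrecoverable one (raise) can be sketched without Paxman at all. In the example below, CapabilityError is a local stand-in, parse_iso_date is a hypothetical helper, and treating a missing input as unrecoverable is an assumption made for illustration.

```python
from datetime import date
from typing import Optional


class CapabilityError(Exception):
    """Local stand-in for Paxman's CapabilityError."""


def parse_iso_date(raw: Optional[bytes]) -> tuple[tuple[date, ...], tuple[str, ...]]:
    """Return (candidates, diagnostics) for an ISO 8601 date held in raw bytes."""
    if raw is None:
        # Unrecoverable (by assumption): the capability cannot run at all.
        raise CapabilityError("no input supplied")
    try:
        text = raw.decode("utf-8").strip()
        return (date.fromisoformat(text),), ()
    except ValueError as exc:
        # Recoverable: the run succeeded but found nothing to extract.
        # UnicodeDecodeError is a ValueError subclass, so it lands here too.
        return (), (f"PATTERN_NO_MATCH: {exc}",)
```

A malformed input such as b"15/01/2026" yields no candidates and one diagnostic, while passing None raises, which mirrors the split between the diagnostics tuple and CapabilityError described in this guide.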
5. What capabilities MUST NOT do
- Assign confidence. Capabilities return candidates; the Reconciler assigns confidence. See ADR-0005.
- Read the canonical contract directly: capabilities receive a CapabilityContext.
- Read the raw input directly: they receive an opaque InputData handle via CapabilityContext.
- Mutate the executor state.
6. The full SPI walkthrough
For the full SPI walkthrough (including a longer example, the
CostHint semantics, and a worked inference-style capability), see
EXTENDING.md §2.
7. See also
- EXTENDING.md §2: full SPI walkthrough.
- docs/concepts/capabilities.md: what a capability is in Paxman.
- ADR-0005: confidence ownership (why capabilities don't assign confidence).
- docs/specs/capability-cost-model.md: the CostHint and the scoring formula.
- paxman.capabilities.spec: the CapabilitySpec data model.