Replay
Status: V1
Audience: Paxman users storing and rehydrating artifacts; Paxman contributors extending the artifact subsystem.
Related docs: REPLAY_AND_DETERMINISM.md (the full deep dive), GLOSSARY.md §Replay, ARCHITECTURE.md §8 Artifact Subsystem.
Replay is the act of rehydrating a previously produced
ExecutionArtifact to its original state without recomputation.
It is the only mechanism for rehydrating an artifact. Paxman
never recomputes from a replayed artifact.
This document explains the replay hash, the replay protocol, version compatibility, and what determinism does and does not guarantee. For the full deep dive, see REPLAY_AND_DETERMINISM.md.
1. Definitions
- Determinism: a property of the planner and executor. Given the same inputs, the planner produces the same ExecutionPlan and the executor runs the same capabilities in the same order.
- Reproducibility: a property of the artifact. Given the same Paxman version, the same inputs, and the same capability versions, the artifact is byte-equal.
- Replay: the act of rehydrating a previously produced ExecutionArtifact to its original state without recomputation.
Determinism does not require capabilities to be deterministic. A
non-deterministic capability (e.g. inference backed by a remote
LLM) may produce different output for the same input; Paxman records
the actual output as evidence and the replay path does not
re-invoke the capability.
Reproducibility requires determinism plus recorded capability outputs. Replay is the only mechanism for rehydrating an artifact.
2. The replay hash
The replay_hash is a deterministic signature over the inputs that
uniquely identify an artifact's content.
2.1 What goes into the hash
The hash is computed over a concatenation (with | separator) of
the following fields, each serialised with the stable JSON encoder
(paxman.serialization.stable_dumps — RFC 8785-style: sorted keys,
no whitespace, canonical number formatting):
| # | Field | Source |
|---|---|---|
| 1 | paxman_version | paxman.__version__ |
| 2 | planner_version | paxman.versioning.PLANNER_VERSION (the literal string "1") |
| 3 | replay_version | paxman.versioning.REPLAY_VERSION (constant "1") |
| 4 | execution_plan | The plan's deterministic JSON (or "null" if absent) |
| 5 | field_results | Per-field resolved results, sorted by field_path |
| 6 | evidence | The artifact's evidence list |
| 7 | diagnostics | The artifact's diagnostics list |
| 8 | statistics | The artifact's statistics |
| 9 | contract_id | The contract's stable id (string) |
| 10 | capability_versions | The capability → version map (keys sorted) |
The hash is computed as sha256("|".join(parts)).hexdigest() — a
64-character lowercase hex string. It is deterministic: the
same artifact produces the same hash.
Fields that are intentionally excluded from the hash (because they may legitimately differ across rehydrations):
- id: the artifact's per-run unique identifier.
- created_at: wall-clock timestamp.
- metadata: caller-supplied annotations.
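The recipe above can be sketched roughly as follows. This is an illustration under stated assumptions, not Paxman's implementation: `stable_dumps` here is a stand-in built on `json.dumps` (it does not reproduce RFC 8785 number formatting), the artifact is modelled as a plain dict, and encoding the joined string as UTF-8 before hashing is an assumption.

```python
import hashlib
import json

# Order of the hash inputs, mirroring the table above.
HASH_FIELDS = (
    "paxman_version", "planner_version", "replay_version", "execution_plan",
    "field_results", "evidence", "diagnostics", "statistics",
    "contract_id", "capability_versions",
)


def stable_dumps(value) -> str:
    """Stand-in for paxman.serialization.stable_dumps (sorted keys, no whitespace)."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def sketch_replay_hash(artifact: dict) -> str:
    """Serialise each hash input, join with '|', and hash with SHA-256."""
    fields = dict(artifact)
    # Per-field results are sorted by field_path before serialisation.
    fields["field_results"] = sorted(
        fields.get("field_results") or [], key=lambda result: result["field_path"]
    )
    # A missing input (for example an absent plan) serialises as "null".
    parts = [stable_dumps(fields.get(name)) for name in HASH_FIELDS]
    # Assumption: the joined string is UTF-8 encoded before hashing.
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


artifact = {
    "id": "run-001",                       # excluded from the hash
    "created_at": "2024-01-01T00:00:00Z",  # excluded from the hash
    "metadata": {"note": "demo"},          # excluded from the hash
    "paxman_version": "1.0.0",
    "planner_version": "1",
    "replay_version": "1",
    "execution_plan": None,
    "field_results": [
        {"field_path": "total", "value": "12.50"},
        {"field_path": "currency", "value": "EUR"},
    ],
    "evidence": [],
    "diagnostics": [],
    "statistics": {},
    "contract_id": "invoice-v1",
    "capability_versions": {"inference": "1.2.0"},
}

digest = sketch_replay_hash(artifact)
print(len(digest))  # 64 hex characters
```

Because `id`, `created_at`, and `metadata` never enter `parts`, changing them leaves the digest untouched, while changing any of the ten listed inputs changes it.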
2.2 What does NOT go into the hash
- Wall-clock timestamps (determinism-safe: no clock in the replay path).
- Random number generator state.
- The Paxman process ID.
- The order in which fields appear in a Python dict (canonical JSON normalizes this).
2.3 Hash algorithm
V1 uses SHA-256. The hash is hex-encoded in the artifact.
2.4 What the hash protects against
The hash detects any modification to the artifact's hash-relevant content (the fields listed in §2.1), for example:
- Changing a resolved value.
- Removing or adding an evidence reference.
- Changing a confidence score.
- Removing a diagnostic.
- Changing the planner or capability version.
Any change produces a different hash. The replay path recomputes
the hash and compares; mismatch raises HashMismatchError.
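A minimal sketch of the recompute-and-compare step, assuming a simplified hash over canonical JSON rather than the full ten-field recipe. `HashMismatchError` is defined locally as a stand-in for Paxman's error type, and `content_hash` and `verify` are hypothetical helper names.

```python
import hashlib
import json


class HashMismatchError(Exception):
    """Local stand-in for Paxman's HashMismatchError."""


def content_hash(content: dict) -> str:
    """Simplified stand-in for the replay hash: SHA-256 over canonical JSON."""
    canonical = json.dumps(content, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(content: dict, stored_hash: str) -> None:
    """Recompute the hash and compare it with the stored one."""
    if content_hash(content) != stored_hash:
        raise HashMismatchError("recomputed hash differs from the stored replay_hash")


content = {
    "field_results": [{"field_path": "total", "value": "12.50", "confidence": 0.93}]
}
stored = content_hash(content)
verify(content, stored)  # untouched content passes silently

# Changing a confidence score is enough to break the match.
tampered = json.loads(json.dumps(content))  # deep copy via a JSON round-trip
tampered["field_results"][0]["confidence"] = 0.99
```

Calling `verify(tampered, stored)` raises `HashMismatchError`, which is the behaviour the list above describes for each kind of edit.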
2.5 What the hash does NOT protect against
- A re-signing attack: if an attacker has full control of the artifact, they can recompute the hash. Paxman is a library; the caller is responsible for storing the artifact securely.
- Time-of-creation metadata: created_at is excluded from the hash (deliberate, for determinism; see §2.1), so changes to it are not detected.
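Since an unkeyed hash cannot stop re-signing, a caller who needs that protection has to add it in their own storage layer. The sketch below shows one possible caller-side approach using an HMAC from Python's standard library. It is not part of Paxman, and the key, payload, and function names are all hypothetical.

```python
import hashlib
import hmac


def sign_payload(payload: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the serialised artifact bytes."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_payload(payload: bytes, key: bytes, tag: str) -> bool:
    """Constant-time check that the tag matches the payload under this key."""
    return hmac.compare_digest(sign_payload(payload, key), tag)


key = b"example-key-kept-outside-the-artifact-store"  # hypothetical secret
payload = b'{"contract_id":"invoice-v1","replay_hash":"placeholder"}'
tag = sign_payload(payload, key)

ok = verify_payload(payload, key, tag)
forged = verify_payload(payload.replace(b"v1", b"v2"), key, tag)
print(ok, forged)
```

An attacker who can rewrite the stored payload but does not hold the key cannot produce a matching tag, which is the gap the plain replay hash leaves open.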
3. The replay protocol
Caller
│
│ artifact: ExecutionArtifact
│ contract: same contract used originally (caller-supplied)
▼
paxman.replay(artifact, contract)
│
├── 1. Type check
│ - artifact must be a non-None ExecutionArtifact instance
│
├── 2. Version check (paxman_version)
│ - artifact.paxman_version must be same major + not newer than
│ the running library; else → VersionMismatchError
│
├── 3. Capability check
│ - every entry in artifact.capability_versions must be in the
│ capability registry; else → CapabilityNotFoundError
│
├── 4. Hash check
│ - recompute the replay_hash over the artifact's hash-relevant
│ fields (see §2.1) and compare; mismatch → HashMismatchError
│
└── 5. Return the rehydrated ExecutionArtifact (byte-equal to input)
Replay is read-only in the strict sense: it does not invoke any
capability, planner, executor, or reconciler. It is pure
deserialization. The contract is supplied by the caller and adapted
to a CanonicalContract; Paxman does not re-validate the contract
during replay (V1) — the contract's id is part of the hash
inputs, so any tampering surfaces as a HashMismatchError.
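The five steps can be sketched as a single function. Everything here is a reduced stand-in for illustration: `ArtifactSketch` replaces ExecutionArtifact, the registry is modelled as a set of (name, version) pairs, the hash function is passed in as a parameter, and the exception raised by the type check is an assumption.

```python
from dataclasses import dataclass


class ReplayError(Exception):
    """Local stand-in for the base replay error."""


class VersionMismatchError(ReplayError):
    pass


class CapabilityNotFoundError(ReplayError):
    pass


class HashMismatchError(ReplayError):
    pass


@dataclass(frozen=True)
class ArtifactSketch:
    """Hypothetical, heavily reduced stand-in for ExecutionArtifact."""
    paxman_version: str
    capability_versions: dict
    replay_hash: str


def replay_sketch(artifact, running_version, registry, compute_hash):
    # 1. Type check (the exception type used here is an assumption).
    if not isinstance(artifact, ArtifactSketch):
        raise ReplayError("artifact must be an ArtifactSketch instance")
    # 2. Version check: same major, and not newer than the running library.
    have = tuple(int(part) for part in artifact.paxman_version.split("."))
    running = tuple(int(part) for part in running_version.split("."))
    if have[0] != running[0] or have > running:
        raise VersionMismatchError(artifact.paxman_version)
    # 3. Capability check: every pinned (name, version) pair must be registered.
    for pinned in artifact.capability_versions.items():
        if pinned not in registry:
            raise CapabilityNotFoundError(pinned)
    # 4. Hash check: recompute and compare.
    if compute_hash(artifact) != artifact.replay_hash:
        raise HashMismatchError("replay_hash mismatch")
    # 5. Return the artifact unchanged; nothing was invoked or recomputed.
    return artifact


def fake_hash(artifact):
    """Placeholder for the real hash function described in §2.1."""
    return "abc123"


registry = {("inference", "1.2.0")}
artifact = ArtifactSketch("1.0.0", {"inference": "1.2.0"}, "abc123")
rehydrated = replay_sketch(artifact, "1.5.0", registry, fake_hash)
```

The function never calls a planner, executor, or capability, which mirrors the read-only property described above.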
4. Version compatibility
The replay path enforces version compatibility along three dimensions:
| Dimension | Strictness | Behavior on mismatch |
|---|---|---|
| paxman_version | Same major version (semver) | VersionMismatchError |
| planner_version | Must be supported by current Paxman | VersionMismatchError |
| capability_versions | All pinned capabilities must be in the registry | CapabilityNotFoundError |
4.1 Cross-major replay
Cross-major replay (e.g. Paxman 1.x artifact on Paxman 2.0) raises
VersionMismatchError. The caller must regenerate the artifact
under the new major.
4.2 Cross-minor replay
Cross-minor replay (e.g. Paxman 1.0 artifact on Paxman 1.5) must
succeed if the artifact's planner version and capability versions
are still supported. If the new Paxman has dropped support for a
planner version or capability version, it raises
VersionMismatchError (or CapabilityNotFoundError).
4.3 Cross-patch replay
Cross-patch replay is always allowed.
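For the paxman_version dimension only, the rules can be sketched as a small predicate. It assumes plain three-part `major.minor.patch` version strings and applies the "same major, not newer than the running library" rule from §3. How an artifact from a newer patch release is treated in practice is an assumption here, and `parse_version` and `is_replayable` are hypothetical names.

```python
def parse_version(version: str) -> tuple:
    """Split 'major.minor.patch' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_replayable(artifact_version: str, running_version: str) -> bool:
    """Same major version, and the artifact is not newer than the library."""
    artifact = parse_version(artifact_version)
    running = parse_version(running_version)
    return artifact[0] == running[0] and artifact <= running


print(is_replayable("1.0.0", "1.5.0"))  # cross-minor, older artifact
print(is_replayable("1.5.0", "1.5.3"))  # cross-patch, older artifact
print(is_replayable("1.9.0", "2.0.0"))  # cross-major
```

Planner and capability support still have to be checked separately, so a `True` result here is necessary but not sufficient for a successful replay.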
5. Determinism guarantees
Paxman guarantees the following:
5.1 What is deterministic
- The Planner output (ExecutionPlan) is deterministic given the same canonical contract, input profile, configuration, capability registry, budget, and policy.
- The Executor invocation order is deterministic given the same ExecutionPlan.
- The Reconciler output (ResolvedResult[]) is deterministic given the same CandidateResult[] and CanonicalContract.
- The Artifact is deterministic given the same ResolvedResult[], ExecutionPlan, evidence, and configuration.
- The replay hash is deterministic given the same inputs.
5.2 What is NOT deterministic
- Capability outputs from non-deterministic capabilities (e.g. inference backed by a remote LLM) may differ between runs. Paxman records the actual output as evidence and the replay path does not re-invoke the capability.
- Wall-clock latency.
- Cost (depends on provider pricing at the time of the call).
5.3 Determinism in the presence of non-deterministic capabilities
Paxman does not require capabilities to be deterministic. A non-deterministic capability is allowed, and its output is recorded as evidence. The artifact includes the recorded output, so replay reproduces the same artifact without re-invoking the capability.
This means:
- A Paxman run with only deterministic capabilities is fully reproducible (same inputs → same artifact, byte-equal).
- A Paxman run with non-deterministic capabilities is replayable (the recorded artifact rehydrates) but not reproducible (re-running the same inputs may produce a different artifact because the non-deterministic capability produced different output).
Both cases satisfy the PRD's determinism requirement: "The same input, contract, version set, and execution constraints must yield the same plan and replayable result." (PRD §4.5).
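The difference between replayable and reproducible can be seen in a toy model. The names below (`FlakyCapability`, `run`, `replay`) are hypothetical and stand in for a remote-inference capability, a normalize run, and the replay path. Artifacts are plain dicts.

```python
import random


class FlakyCapability:
    """Stand-in for a non-deterministic capability such as remote inference."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)
        self.calls = 0

    def __call__(self, text: str) -> str:
        self.calls += 1
        return f"{text}:{self._rng.random():.6f}"


def run(capability, text: str) -> dict:
    """Invoke the capability once and record its actual output as evidence."""
    return {"input": text, "evidence": [capability(text)]}


def replay(artifact: dict) -> dict:
    """Rehydrate from the recorded artifact; no capability is called."""
    return dict(artifact)


capability = FlakyCapability(seed=0)
first = run(capability, "invoice")
second = run(capability, "invoice")  # same input, different recorded output
rehydrated = replay(first)

print(first == second)      # re-running is not reproducible
print(rehydrated == first)  # replaying is exact
print(capability.calls)     # replay added no invocation
```

Re-running produces a different artifact because the capability was invoked again, whereas replay only reads what was recorded.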
6. Property tests
Determinism is verified by Hypothesis property tests. Examples:
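from hypothesis import given

import paxman

# contracts(), inputs(), budgets(), policies() and artifacts() are assumed to be
# project-defined Hypothesis strategies; planner, profile, registry and serialize
# are assumed to come from the test module's setup.

@given(contract=contracts(), input_data=inputs(), budget=budgets(), policy=policies())
def test_planner_is_deterministic(contract, input_data, budget, policy):
    # Planning twice from identical inputs must give identical serialized plans.
    plan_a = planner.plan(contract, profile(input_data), budget, policy, registry)
    plan_b = planner.plan(contract, profile(input_data), budget, policy, registry)
    assert serialize(plan_a) == serialize(plan_b)

@given(artifact=artifacts(), contract=contracts())
def test_replay_is_byte_equal(artifact, contract):
    # Replay must return an artifact that serializes identically to its input.
    rehydrated = paxman.replay(artifact, contract)
    assert serialize(rehydrated) == serialize(artifact)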
These tests run in CI on every PR.
7. The replay API
from decimal import Decimal

import paxman

# Original run
artifact = paxman.normalize(
    input_data=b"raw input",
    contract=my_contract,
    budget=paxman.Budget(max_total_cost_usd=Decimal("0.10")),
    policy=paxman.Policy(allow_remote_inference=True),
)
# ... store artifact somewhere (caller's responsibility) ...
# Later, with just the artifact and the contract
rehydrated = paxman.replay(artifact, contract=my_contract)
assert rehydrated == artifact  # byte-equal
The paxman.replay() function:
- Takes the captured ExecutionArtifact and the same contract used originally.
- Returns a rehydrated ExecutionArtifact that is byte-equal to the input.
- Raises ReplayError (or one of its subclasses: HashMismatchError, VersionMismatchError, CapabilityNotFoundError, InvalidContractError) on any failure.
7.1 Serialization round-trip
The artifact is JSON-serializable, so it can be stored as JSON and rehydrated later:
import json

import paxman
from paxman.artifact.artifact import ExecutionArtifact

# Serialize
payload = json.dumps(artifact.to_dict(), sort_keys=True, separators=(",", ":"))
# ... store payload somewhere ...
# Rehydrate
artifact = ExecutionArtifact.from_dict(json.loads(payload))
rehydrated = paxman.replay(artifact, contract=my_contract)
The serialization format is stable and versioned. A paxman_version
field in the JSON is used by the replay path to enforce version
compatibility.
7.2 Golden artifacts
Paxman ships a set of golden artifact fixtures in
tests/fixtures/artifacts/. Each golden is a JSON snapshot of a
real paxman.normalize() run. Replay-equality tests verify that
paxman.replay(golden, contract) returns a rehydrated artifact that
is byte-equal to the golden.
The goldens are bootstrapped from real implementations, never
predicted. The bootstrap procedure is documented in
tests/fixtures/artifacts/GENERATION.md.
8. What to do if replay fails
| Error | Likely cause | Caller action |
|---|---|---|
| HashMismatchError | The artifact was modified or corrupted. | Investigate the source of modification. The artifact is no longer trustworthy. |
| VersionMismatchError | The Paxman version does not support the artifact. | Upgrade Paxman or regenerate the artifact under the current version. |
| CapabilityNotFoundError | A pinned capability is no longer registered. | Register the missing capability or regenerate the artifact with available capabilities. |
| InvalidContractError | The contract supplied to replay is invalid or has been tampered with. | Investigate the source of the contract. |
Replay is fail-closed: a failure means the artifact cannot be trusted, and the caller must take explicit action.
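One way a caller might wire the table into code is sketched below. The exception classes are local stand-ins (real code would use Paxman's own error types), and `replay_or_report`, `CALLER_ACTIONS`, and `always_corrupt` are hypothetical names used only for this illustration.

```python
class ReplayError(Exception):
    """Local stand-in for the base replay error."""


class HashMismatchError(ReplayError):
    pass


class VersionMismatchError(ReplayError):
    pass


class CapabilityNotFoundError(ReplayError):
    pass


class InvalidContractError(ReplayError):
    pass


# Caller actions, paraphrasing the table above.
CALLER_ACTIONS = {
    HashMismatchError: "investigate the modification; do not trust the artifact",
    VersionMismatchError: "upgrade Paxman or regenerate the artifact",
    CapabilityNotFoundError: "register the capability or regenerate the artifact",
    InvalidContractError: "investigate the source of the contract",
}


def replay_or_report(replay_fn, artifact, contract):
    """Return (artifact, None) on success or (None, action) on failure."""
    try:
        return replay_fn(artifact, contract), None
    except ReplayError as error:
        # Fail closed: never hand back the artifact after a failed replay.
        action = CALLER_ACTIONS.get(type(error), "treat the artifact as untrusted")
        return None, action


def always_corrupt(artifact, contract):
    """Hypothetical replay function that simulates a tampered artifact."""
    raise HashMismatchError("replay_hash mismatch")


result, action = replay_or_report(always_corrupt, {"id": "run-001"}, contract=None)
print(result, action)
```

Returning `None` in place of the artifact keeps a failed replay from being used by accident further down the call chain.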
9. See also
- REPLAY_AND_DETERMINISM.md — the full deep dive on replay and determinism.
- ARCHITECTURE.md §8 Artifact Subsystem — internal architecture of the artifact subsystem.
- ARCHITECTURE.md §9 Versioning Strategy — version compatibility policy.
- GLOSSARY.md §Replay — vocabulary.
- ADR-0005 — confidence ownership (related: confidence is deterministic for replay).
- docs/howto/replay_artifact.md — a quick start for using replay.