Testing Harness¶

crushr includes a purpose-built corruption harness used to answer a simple question:

Which parts of the archive design actually improve recoverability, and which parts only add complexity?

This harness is one of the most important parts of the project. It is how crushr moves from speculation to evidence.

What the harness does¶

At a high level, the harness:

builds deterministic archive variants
applies controlled corruption
runs salvage analysis
classifies what remains recoverable
compares format variants against each other

Why the harness exists¶

Most archive formats are designed around intact containers. crushr is explicitly testing damaged ones.

The harness makes it possible to ask questions like:

Does self-identifying payload data improve recovery?
Do extra metadata layers survive often enough to justify their size?
Does metadata placement matter?
When does recovery downgrade from named to anonymous or orphaned?

High-level workflow¶

flowchart TD
    A["Dataset corpus"] --> B["Build archive variants"]
    B --> C["Apply deterministic corruption"]
    C --> D["Run salvage analysis"]
    D --> E["Classify recovery"]
    E --> F["Aggregate results"]
    F --> G["Design decision"]

Corpus used by the harness¶

The harness operates on a bounded set of representative dataset shapes:

smallfiles — many small items and metadata-heavy structure
mixed — a blend of file sizes and types
largefiles — fewer large payloads where block-level recovery matters more

These profiles are useful because different archive strategies can succeed or fail differently depending on the dataset shape.

Core corruption families¶

The harness started with simple structural damage targets:

header damage
index damage
payload damage
tail truncation

As crushr evolved, later phases introduced richer corruption families targeting metadata survival and graph reasoning.

Recovery classes used by salvage¶

The harness records recovery using the recovery classes exposed by the salvage engine.

Class	Meaning
`FULL_NAMED_VERIFIED`	Complete recovery with trusted name/path
`FULL_VERIFIED`	Complete recovery without trusted name/path
`PARTIAL_ORDERED_VERIFIED`	Partial recovery with provable order
`PARTIAL_UNORDERED_VERIFIED`	Partial recovery without proven order
`ORPHAN_EVIDENCE_ONLY`	Verified fragments with no file reconstruction
`NO_VERIFIED_EVIDENCE`	Nothing salvageable remains

These classes matter because crushr is not just measuring “did extraction succeed?” It is measuring what level of truth still survives.

Format research phases¶

The harness has been used to evaluate several experimental format directions.

FORMAT-05 — self-identifying payload blocks¶

This phase introduced payload blocks that carry local truth about what they belong to.

Result: this was the biggest structural win so far. Recovery improved substantially because surviving payload blocks could still be identified even when metadata was damaged.

FORMAT-06 — file manifests¶

This phase added file-level structural truth to improve recovery confidence and completeness classification.

Result: useful for richer reasoning, but not the main source of recovery gains.

FORMAT-07 — graph-aware salvage reasoning¶

This phase changed the salvage engine so it reasons over surviving verified relationships instead of flat metadata checks.

Result: richer and more explicit classification, with stable behavior across scenarios.

FORMAT-08 — metadata placement strategies¶

Three metadata placement strategies were compared:

fixed_spread
hash_spread
golden_spread

Result: no measurable difference. Metadata never survived in the tested scenarios, so placement had no effect.

FORMAT-09 — metadata survivability and necessity audit¶

This phase directly tested whether metadata layers survive often enough to matter.

Result: metadata survival counts remained effectively zero, strongly suggesting that the real resilience mechanism is payload truth rather than traditional metadata.

What the harness has revealed so far¶

The biggest lesson from the experiments is this:

the archive's recoverability is being driven by self-identifying payload blocks,
not by duplicated metadata surfaces

That has pushed the project toward simplification rather than accumulation.

Why this is unusual¶

In most archive formats:

metadata is authoritative
payload depends on metadata

In crushr's emerging model:

payload proves what it is
metadata helps when it survives

This is a meaningful difference in how recoverability is approached.

Harness outputs¶

The harness generates compact machine-readable and human-readable summaries.

Typical outputs include:

comparison summary JSON
comparison summary Markdown
grouped results by dataset
grouped results by corruption target
recovery class distributions
metadata survival statistics

These outputs are used to decide whether a design change should be kept, changed, or removed.

Example decision loop¶

flowchart LR
    A["Add experimental feature"] --> B["Run corruption suite"]
    B --> C{"Improves recovery?"}
    C -- Yes --> D["Keep or refine"]
    C -- No --> E["Prune or demote"]

This is how crushr is being shaped into something smaller and more evidence-driven over time.

What comes next¶

The next pruning and simplification phases use the harness to answer:

Which metadata layers are actually necessary?
Which are just archive overhead?
What is the minimal truth surface needed for strong recovery?

That is the direction the project is currently heading.

Practical takeaway¶

If you only remember one thing about the harness, make it this:

crushr is not being designed by theory alone. It is being designed by repeatedly breaking archives and measuring what still lives.