Extent identity
Intent
An extent identity is the atomic unit of payload truth in crushr.
It binds payload bytes to a deterministic, verifiable identity independent of metadata completeness or broader container state.
Guarantees
- Verified data is never silently corrupted or misrepresented
- Unverifiable data is never presented as valid
- Degraded or partial results are explicitly labeled and structured
- Archive processing fails closed when required truth cannot be established
- Filesystem writes are constrained and cannot escape intended boundaries
Behavior
Structure
| Field | Size | Description |
|---|---|---|
| hash | 32 bytes | BLAKE3 hash of raw extent payload |
| length | 8 bytes | exact byte length of extent |
| offset | 8 bytes | logical offset within original file |
| flags | 4 bytes | bitfield describing extent properties |
Hash derivation
```text
hash = blake3(payload_bytes)
```
No normalization. No framing. Raw payload only.
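The rule can be sketched in Python as follows. The standard library has no BLAKE3, so this sketch takes the hash function as a parameter and demonstrates with `hashlib.blake2b` at a 32-byte digest size purely as a stand-in. That stand-in is not BLAKE3 and produces different digests; with a BLAKE3 binding available, a wrapper around it would be passed instead. `extent_hash` and `standin_hash` are hypothetical names.

```python
import hashlib
from typing import Callable

HashFn = Callable[[bytes], bytes]


def standin_hash(data: bytes) -> bytes:
    # STAND-IN ONLY: BLAKE2b with a 32-byte digest, NOT BLAKE3.
    return hashlib.blake2b(data, digest_size=32).digest()


def extent_hash(payload: bytes, hash_fn: HashFn = standin_hash) -> bytes:
    # Hash exactly the payload bytes: no length prefix, no header,
    # no case, newline, or encoding normalization.
    return hash_fn(payload)


payload = bytes.fromhex("48656c6c6f")
digest = extent_hash(payload)

# Framing or normalizing the input changes the bytes being hashed,
# so the result no longer identifies the original payload.
framed = len(payload).to_bytes(8, "little") + payload
normalized = payload.lower()
```

Keeping the input to the raw bytes means two parties can agree on an extent's hash without agreeing on any container framing first.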
Semantics
Extent identity is intentionally independent from naming, path, and surrounding metadata.
This separation is foundational to crushr's recovery model:
- payload truth can survive metadata loss
- metadata loss does not imply payload corruption
- recovery classification can preserve verified data even when identity context is incomplete
Constraints
- Any mutation of payload invalidates the extent identity
- Offset is descriptive; on its own it is not sufficient to prove identity
- No dual identity systems are permitted within the archive model
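A minimal Python sketch of how these constraints might be enforced at verification time. It checks both length and hash, raises instead of returning a partial result, and never consults the offset when deciding validity. `verify_extent` and `ExtentMismatch` are hypothetical names, and `hashlib.blake2b` is used only as a stand-in because the standard library does not provide BLAKE3.

```python
import hashlib


class ExtentMismatch(Exception):
    """Raised when a payload does not match its recorded identity."""


def standin_hash(data: bytes) -> bytes:
    # STAND-IN ONLY: BLAKE2b with a 32-byte digest, NOT BLAKE3.
    return hashlib.blake2b(data, digest_size=32).digest()


def verify_extent(payload: bytes, expected_hash: bytes, expected_length: int,
                  hash_fn=standin_hash) -> bytes:
    # Length is a cheap early rejection; the hash comparison is the proof.
    if len(payload) != expected_length:
        raise ExtentMismatch("length mismatch")
    if hash_fn(payload) != expected_hash:
        raise ExtentMismatch("hash mismatch")
    # Offset is deliberately not an input: it describes placement only.
    return payload


original = b"Hello"
recorded_hash = standin_hash(original)

verified = verify_extent(original, recorded_hash, 5)

# A single-byte mutation at the same length invalidates the identity.
try:
    verify_extent(b"Hellp", recorded_hash, 5)
    mutated_accepted = True
except ExtentMismatch:
    mutated_accepted = False
```

Raising on mismatch, rather than returning a flag that callers might ignore, is one way to keep unverifiable bytes from being presented as valid.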
Boundaries / Non-goals
This page does not define file naming, recovery output classes, or metadata restoration policy.
Non-goals:
- No best-effort reconstruction
- No hidden failure smoothing
- No compression-first tradeoffs
- No external decode dependencies
Example
```text
payload: 0x48656c6c6f
hash: blake3(payload)
length: 5
offset: 1024
flags: 0x01
```
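The example's fields can be cross-checked with a few lines of Python. The payload bytes decode to ASCII `Hello`, which is 5 bytes and matches `length`. The hash is left out, as it is symbolic in the example. The meaning of individual flag bits is not defined on this page, so the sketch only reports which bit positions are set.

```python
payload = bytes.fromhex("48656c6c6f")  # 0x48656c6c6f from the example
length = 5
offset = 1024
flags = 0x01

# The length field must equal the exact payload byte count.
length_matches = len(payload) == length

# Flag semantics are not defined here; list the set bit positions only.
set_bits = [bit for bit in range(32) if flags & (1 << bit)]

print(payload.decode("ascii"), length_matches, set_bits)
# prints: Hello True [0]
```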
Extent identity is a root invariant of the format. Higher-level recovery behavior depends on it but does not replace it.