It Begins

Chronicles entry — originally published externally and preserved here as a historical milestone.

Context

  • Project: crushr
  • Series: Chronicles
  • Status: Historical snapshot
  • Original venue: LinkedIn

I Built a Compression Format With AI Acting as My Engineering Partner

March 13, 2026

Over the past few months I’ve been building a compression format called crushr.

The goal was not to replace ZIP tomorrow or compete with zstd on performance benchmarks. The real purpose was to explore something else:

What happens when you treat AI as part of a disciplined engineering workflow rather than as a code generator?

Compression formats are a good test for this kind of experiment.

They touch many of the harder parts of systems engineering. You have to think about binary layout, corruption behavior, integrity guarantees, indexing structures, and long-term compatibility. A mistake in the design can surface months later when someone tries to extract data from a damaged archive.

In other words, it is the kind of problem where engineering discipline actually matters.

That made it a useful place to see how AI fits into a real development process.


The ground rule

The first rule of the project was simple.

AI was not allowed to behave like a magic coding assistant.

Instead, it had to operate inside a normal engineering workflow with the same constraints a human developer would have. That meant planning documents, explicit architecture decisions, task definitions, hostile peer review, and a recorded project state.

Every piece of work was done through structured task packets. The AI did not just write code. It received a defined objective, implemented it, and then its work was reviewed before it could be accepted into the codebase.

Within that workflow the AI effectively played several roles.

Planner. Builder. Reviewer. Controller.

Each role had clear responsibilities and boundaries.

If something was ambiguous, the system had to stop and ask instead of inventing an interpretation.
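One way to picture the stop-and-ask rule is as a gate on the task packet itself. This is a minimal sketch; the class and field names here are illustrative assumptions, not the project's actual schema:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a task packet. Field names are illustrative,
# not the real crushr workflow schema.
@dataclass
class TaskPacket:
    objective: str                 # the defined objective the builder implements
    constraints: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)

    def ready(self) -> bool:
        # If anything is ambiguous, the system must stop and ask
        # instead of inventing an interpretation.
        return not self.open_questions

packet = TaskPacket(
    objective="Implement footer checksum verification",
    open_questions=["Which checksum algorithm is canonical?"],
)
assert not packet.ready()  # blocked until the question is answered
```

The point of the gate is that ambiguity is a first-class state: a packet with open questions cannot proceed to implementation.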


What surprised me

What surprised me is that this approach actually works.

When AI is used without structure it tends to generate a lot of code quickly, but the results often feel fragile. Architecture drifts. Documentation falls out of sync. Small decisions compound into confusing systems.

Inside a structured process the behavior changes completely.

The AI becomes very good at implementing well-defined tasks, reviewing architectural cleanliness, identifying inconsistencies between documents and code, and enforcing constraints that humans often forget once a project gets moving.

In practice, it feels less like an automatic programmer and more like a very fast junior engineer paired with an extremely stubborn reviewer.


What we built

The current state of the project is the baseline implementation of the crushr archive format.

That includes the format structure itself along with the surrounding tooling needed to evaluate it properly. The repository contains deterministic archive generation, structured extraction behavior, and a research harness for testing how archives behave when corruption occurs.

Before publishing anything publicly we are running a full experimental trial comparing crushr against several common formats:

ZIP, tar + zstd, tar + gzip, tar + xz

The purpose is not to produce marketing data; it is measurement, used to refine the design and address its weaknesses.

The experiment generates thousands of corruption scenarios across different archive regions and corruption magnitudes. Every scenario comes from a deterministic manifest so the exact set of tests can be reproduced later.
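A deterministic manifest of this kind can be sketched as a seeded generator: the same seed always yields the same scenario list, so the exact test set can be regenerated later. The region names and parameters below are illustrative assumptions, not the project's actual manifest format:

```python
import random

# Hypothetical archive regions; the real format's regions may differ.
REGIONS = ["header", "index", "data", "footer"]

def build_manifest(seed: int, count: int) -> list[dict]:
    """Generate a reproducible list of corruption scenarios.

    Seeding the RNG makes the manifest deterministic: re-running with
    the same seed reproduces the exact set of tests.
    """
    rng = random.Random(seed)
    return [
        {
            "id": i,
            "region": rng.choice(REGIONS),
            "offset": rng.randrange(0, 1 << 20),    # byte offset to corrupt
            "magnitude": rng.choice([1, 16, 256]),  # number of bytes flipped
        }
        for i in range(count)
    ]

# Same seed, same scenarios -- the reproducibility guarantee.
assert build_manifest(42, 1000) == build_manifest(42, 1000)
```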

Each run produces machine-readable records describing what happened during extraction. Those records are normalized into a common schema so the results can be compared across formats.
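Normalization into a common schema could look something like the sketch below, where each format's raw extraction record is mapped onto shared fields. All field names here are hypothetical, chosen only to show the shape of the idea:

```python
# Hypothetical normalization step; field names are illustrative assumptions,
# not the project's actual record schema.
def normalize_record(fmt: str, raw: dict) -> dict:
    """Map a format-specific extraction record onto shared fields
    so results can be compared across formats."""
    return {
        "format": fmt,
        "scenario_id": raw.get("id"),
        "files_recovered": raw.get("recovered", 0),
        "files_total": raw.get("total", 0),
        "error": raw.get("error"),  # None if extraction succeeded cleanly
    }

zip_raw = {"id": 3, "recovered": 8, "total": 10, "error": "bad CRC"}
record = normalize_record("zip", zip_raw)
assert record["files_recovered"] == 8
assert record["format"] == "zip"
```

Once every format emits records in this shape, a chart comparing, say, recovery rate by corrupted region is a straightforward aggregation over one table.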

The goal is simple. Every chart or claim in the eventual white paper should be traceable back to raw experimental artifacts.

No cherry-picked results.


What AI was not allowed to do

Many demonstrations of AI programming look impressive until you examine the details. To avoid that trap, several restrictions were put in place.

AI was not allowed to silently change architectural decisions. It could not rewrite planning documents without explicit instruction. It could not skip testing steps. It could not reinterpret requirements creatively.

If a requirement was unclear, the system had to stop and ask.

Those constraints ended up being more important than any technical trick.


Lessons so far

A few things became clear during this process.

First, the engineer still has to own the architecture. AI can accelerate implementation and review, but design decisions and tradeoffs still require human judgment.

Second, documentation becomes even more important when multiple agents interact with the same codebase. The documents effectively become the coordination layer that keeps the system coherent.

Third, engineering discipline scales surprisingly well when AI is involved. Decision logs, architecture maps, deterministic workflows, and clear task definitions make collaboration much smoother.

None of those practices are new. AI just makes their value more obvious.


What happens next

The next phase of the project is running the full corruption trial matrix and collecting the results.

Those results will feed into a technical white paper describing the methodology, the data, and what the current design does well or poorly compared with existing formats.

The format itself will continue evolving after that. Several capabilities are planned for future versions, including recoverable archives, random-access extraction, and deduplication.

Those features are intentionally being deferred until after the baseline evaluation so the current results describe the format exactly as it exists today.


The bigger takeaway

Working on this project changed how I think about AI in engineering.

The interesting part is not automatic code generation.

The interesting part is using AI as a collaborator inside a well-designed development process.

When the workflow is structured and the constraints are clear, AI becomes a useful force multiplier rather than a source of chaos.

I suspect we are only beginning to see what that style of development can look like.


If you are experimenting with similar workflows, I would be interested to hear how it has worked for you.