A parameter-free framework for recursive bit-plane decomposition of arbitrary byte data. Every decision (which bit to separate, which codec to use, when to halt) is determined entirely by the data at runtime. No training. No thresholds. No tuning constants. For any input, the algorithm terminates by depth 8, by proof rather than by parameter.
Run AIM's entropy analysis directly in the browser. Drop any file to generate a full decay profile, bit-plane distribution fingerprint, and codec recommendation report. The analyzer uses the same structural logic as the C implementation and runs entirely in the browser: no server, no upload.
Byte-level sliding window compressors treat every byte the same. But the high byte and low byte of a PCM sample have completely different entropy profiles. Consecutive YUV pixels repeat with period 4. Consecutive struct fields align at their width. These structures are invisible to LZ77.
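The difference between the two bytes of a PCM sample is easy to measure. The sketch below is not part of AIM: it builds a hypothetical test signal (a quiet 440 Hz sine, 16-bit little-endian mono) and compares the order-0 entropy of the low and high byte streams. The helper name `shannon_entropy` and the signal parameters are assumptions chosen for illustration.

```python
import math
import struct
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Order-0 Shannon entropy in bits per byte."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

# Hypothetical signal: one second of a quiet 440 Hz sine at 44.1 kHz.
samples = [int(2000 * math.sin(2 * math.pi * 440 * i / 44100)) for i in range(44100)]
pcm = struct.pack(f"<{len(samples)}h", *samples)  # 16-bit little-endian

low_bytes = pcm[0::2]   # least significant byte of each sample
high_bytes = pcm[1::2]  # most significant byte of each sample

print(f"low byte:  {shannon_entropy(low_bytes):.2f} bits/byte")
print(f"high byte: {shannon_entropy(high_bytes):.2f} bits/byte")
```

With an amplitude of 2000 the high byte can only take 16 distinct values, so its entropy is at most 4 bits, while the low byte is spread across most of its range. A byte-oriented compressor sees the two streams interleaved and cannot model them separately.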
AIM decomposes data losslessly into structured and entropic components — isolating the compressible from the incompressible. The result is a decay profile: a mathematical fingerprint of the data's internal structure. It correctly identifies when not to apply advanced techniques — returning near-parity on random data and accepting that gzip wins on natural language text.
1. Sweep all 8 bit positions across the full stream. Find the sparsest bit plane.
2. Extract the plane as a flag set. Seven codecs race; the shortest encoding wins.
3. Clear the bit, remap to a halved symbol alphabet, recurse on the aligned stream.
4. Halt conditions fire when continuing costs more than stopping. At depth 8, halt is proven, not parameterized.
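The loop described above can be sketched in a few lines of Python. This is one possible reading, not the aim3 implementation: it assumes that "remap to a halved symbol alphabet" means deleting the chosen bit position from every symbol, it omits the cost-based halt conditions (which the text does not specify), and it stands in two hypothetical flag codecs (`encode_bitmap`, `encode_gaps`) for the seven real ones. All function names are illustrative.

```python
def sparsest_bit(stream, width):
    """Sweep every bit position; return the one set in the fewest symbols."""
    counts = [sum((s >> b) & 1 for s in stream) for b in range(width)]
    return min(range(width), key=counts.__getitem__)

def drop_bit(symbol, bit):
    """Delete one bit position and close the gap, halving the alphabet."""
    low = symbol & ((1 << bit) - 1)
    return ((symbol >> (bit + 1)) << bit) | low

def insert_bit(symbol, bit, value):
    """Inverse of drop_bit: reopen the gap and restore the flag."""
    low = symbol & ((1 << bit) - 1)
    return ((symbol >> bit) << (bit + 1)) | (value << bit) | low

def encode_bitmap(flags, n):
    """Stand-in codec 1: one bit per stream position."""
    out = bytearray((n + 7) // 8)
    for i in flags:
        out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def encode_gaps(flags, n):
    """Stand-in codec 2: gaps between flags as LEB128-style varints."""
    out, prev = bytearray(), -1
    for i in flags:
        gap, prev = i - prev - 1, i
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)
            gap >>= 7
        out.append(gap)
    return bytes(out)

def race(flags, n):
    """Run every candidate codec; the shortest encoding wins."""
    candidates = {"bitmap": encode_bitmap(flags, n), "gaps": encode_gaps(flags, n)}
    name = min(candidates, key=lambda k: len(candidates[k]))
    return name, candidates[name]

def decompose(data: bytes):
    """Peel one bit plane per layer. Assumption: no early halt, so all 8 run."""
    stream, width, layers = list(data), 8, []
    while width > 0:
        bit = sparsest_bit(stream, width)
        flags = [i for i, s in enumerate(stream) if (s >> bit) & 1]
        layers.append((bit, flags))
        stream = [drop_bit(s, bit) for s in stream]
        width -= 1  # alphabet halves; at width 0 nothing is left to split
    return layers

def recompose(layers, length):
    """Rebuild the original bytes by reinserting planes in reverse order."""
    stream = [0] * length
    for bit, flags in reversed(layers):
        flagged = set(flags)
        stream = [insert_bit(s, bit, int(i in flagged)) for i, s in enumerate(stream)]
    return bytes(stream)

data = bytes([0x10, 0x11, 0x12, 0x93] * 64)
layers = decompose(data)
assert recompose(layers, len(data)) == data  # lossless roundtrip
print(len(layers))  # 8
```

Under this reading the depth bound needs no parameter: each layer removes one bit from an 8-bit symbol, so after eight layers the alphabet has a single symbol and nothing remains to decompose. A real implementation would stop earlier whenever the halt conditions fire.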
Mathematical sequences with strong arithmetic regularity halt in 2–4 layers. Prime Gaps halt at depth 2 with a flag ratio of 0.001. Fibonacci mod 256 halts at depth 4 with a flag ratio of exactly 2/3 — a deterministic arithmetic property, not an approximation.
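An exact ratio of 2/3 is the kind of value that falls out of Fibonacci arithmetic directly. The text does not say which bit plane produces the depth-4 figure, so the sketch below only checks one property that is certain: Fibonacci numbers follow the parity pattern even, odd, odd, so the lowest bit is set in exactly two thirds of the terms over a full period of the sequence mod 256 (the Pisano period, 384). The function name `fib_mod` is illustrative.

```python
from fractions import Fraction

def fib_mod(n_terms, modulus=256):
    """First n_terms Fibonacci numbers reduced mod `modulus`."""
    a, b, out = 0, 1, []
    for _ in range(n_terms):
        out.append(a)
        a, b = b, (a + b) % modulus
    return out

PERIOD = 384  # Pisano period for modulus 256
seq = fib_mod(2 * PERIOD)
assert seq[:PERIOD] == seq[PERIOD:]  # the sequence repeats after one period

ratio = Fraction(sum(v & 1 for v in seq[:PERIOD]), PERIOD)
print(ratio)  # 2/3
```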
Decay depth alone is insufficient to distinguish data types. Natural Language and Random Noise both run 13 layers with nearly identical flag ratios — but their bit distributions at L0 are completely different. The fingerprint requires both decay profile and bit distribution.
The sparsest bit plane and the optimal entropy target can differ. For Prime Gaps: the sweep picks bit 6 (zero flags, zero gain). The entropy winner is bit 4 — 190 flags, but a genuine −6.53% net entropy reduction because clearing it concentrates the value distribution.
This is the first confirmed case of genuine entropy reduction rather than relocation of entropy from the values into the flags. The condition: the target bit must be set for a structurally distinct, sparse minority whose positions are cheap to encode and whose removal genuinely narrows the value distribution.
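The gap between "sparsest" and "best for entropy" can be reproduced on synthetic data. The cost model below is an assumption, not AIM's: values are charged at order-0 entropy, and flag positions are charged at the order-0 entropy of the gaps between them, so regularly spaced flags are nearly free. The model matters because if both streams were charged at plain order-0 entropy, splitting off a bit could never lower the total, since H(X) ≤ H(X with bit cleared) + H(bit); any real gain has to come from structure in the flag positions. The data set, `entropy_bits`, and `net_change` are all hypothetical.

```python
import math
import random
from collections import Counter

def entropy_bits(symbols):
    """Total order-0 cost in bits: len(symbols) times Shannon entropy."""
    n = len(symbols)
    if n == 0:
        return 0.0
    return -sum(c * math.log2(c / n) for c in Counter(symbols).values())

def net_change(data, bit):
    """Relative cost change when `bit` is split off as a flag set.

    Assumed model: cleared values at order-0 entropy, flag positions at
    the order-0 entropy of their gaps. Requires non-constant data.
    """
    before = entropy_bits(data)
    cleared = [v & ~(1 << bit) for v in data]
    positions = [i for i, v in enumerate(data) if (v >> bit) & 1]
    gaps = [b - a for a, b in zip([0] + positions, positions)]
    after = entropy_bits(cleared) + entropy_bits(gaps)
    return (after - before) / before

# Synthetic stream: a pseudo-random low nibble, plus bit 4 set on every
# tenth symbol. Bits 5-7 are never set.
rng = random.Random(0)
data = [rng.randrange(16) | (0x10 if i % 10 == 0 else 0) for i in range(1600)]

flag_counts = [sum((v >> b) & 1 for v in data) for b in range(8)]
sparsest = min(range(8), key=flag_counts.__getitem__)
changes = [net_change(data, b) for b in range(8)]
best = min(range(8), key=changes.__getitem__)

print(f"sparsest bit: {sparsest}, flags: {flag_counts[sparsest]}, "
      f"change: {changes[sparsest]:+.2%}")
print(f"entropy winner: bit {best}, flags: {flag_counts[best]}, "
      f"change: {changes[best]:+.2%}")
```

Here the sweep lands on an empty plane with zero flags and zero gain, while bit 4 carries 160 flags at perfectly regular positions: cheap to encode, and clearing it collapses 32 distinct values into 16.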
| File | Raw size | AIM output | Size vs gzip (negative = smaller than gzip) |
|---|---|---|---|
| PCM Audio (38.7 MB WAV) | 38.7 MB | 28.7 MB | −25.90% |
| Uncompressed Video (527 MB YUV) | 527 MB | 194.7 MB | −63.07% |
| Synthetic YUV420 (5 MB) | 5.0 MB | 2.73 MB | −41.27% |
| Synthetic PCM (5 MB) | 5.0 MB | 3.63 MB | −22.70% |
| Source Code | 48 KB | — | +19% — gzip wins on text |
| Random Data | 100 KB | — | +0.14% — near-parity |
AIM is not a universal improvement over gzip. It targets data with bit-plane regularity or inter-symbol periodicity: PCM audio, YUV video, scientific measurements, packed binary formats. For text, gzip wins and AIM correctly identifies this.
Initial Python implementation. REAL transform. Bit-plane sweep. First flag codec race.
Full recursive decomposition. Multiple codec backends. Wire format v4–v8. Roundtrip verification.
Structural fingerprinting. Decay profiles. Confirmed findings F1–F7. AIM Web Analyzer tool.
Full C implementation (aim3). 29× faster than Python. Memory optimization. Large file support.
rANS backends. ANS order-0/1/2d. Stride-k selection. Memory-bounded streaming. Chunk analyzer.
v15. Optimal early cutoff. −25.90% vs gzip on PCM audio. −63.07% on uncompressed video.