Experiment exp_mdl: Minimum Description Length for Sparse Parity¶

Date: 2026-03-06 Status: SUCCESS Approach: #17 -- MDL / Compression. The best compressor of the label sequence is the one that knows the secret bits.

Hypothesis¶

For each candidate k-subset S, compute the parity under S and measure how well it compresses the label sequence. The true subset produces a perfectly compressible (zero-entropy) residual; wrong subsets produce ~50% error rate residuals that cost ~n_samples bits. MDL is more general than Fourier correlation: it works for any deterministic labeling function and degrades gracefully under label noise.

Config¶

Parameter	Value
n_bits	20, 50
k_sparse	3, 5
n_samples	500
seeds	42, 43, 44
noise_rate	0% (clean), 5% (robustness test)
DL formula	n_samples * H(residual_rate), H = binary entropy
Comparison	Fourier/Walsh-Hadamard correlation on same data

Results¶

Clean data (no noise)¶

Config	C(n,k)	MDL correct	MDL avg time	Fourier correct	Fourier avg time
n=20, k=3	1,140	3/3	0.0133s	3/3	0.0118s
n=50, k=3	19,600	3/3	0.2673s	3/3	0.2090s
n=20, k=5	15,504	3/3	0.2542s	3/3	0.1868s

True subset DL = 0.0 bits (perfect compression) in all cases
Wrong subset DL averages ~499.3 bits (out of 500 samples), confirming near-chance residual rate

Noise robustness (5% label flips)¶

Config	MDL correct	MDL best DL (bits)	Fourier correct	Fourier best corr
n=20, k=3	3/3	157.0 avg	3/3	0.887 avg
n=50, k=3	3/3	107.0 avg	3/3	0.932 avg
n=20, k=5	3/3	157.0 avg	3/3	0.887 avg

Under 5% noise, the true subset DL rises from 0 to ~107-164 bits (= 500 * H(0.05) ~ 143 bits) but remains far below wrong subsets (~499 bits)
Both MDL and Fourier find the correct subset under 5% noise -- separation is large enough

MemTracker (n=20, k=3, seed=42)¶

Metric	Value
Total floats accessed	2,290,500
Reads	2,280
Writes	2
Weighted ARD	1,147,375 floats

ARD is large because every subset scan re-reads x and y from scratch (streaming pattern, one pass per subset).

Analysis¶

What worked¶

MDL finds the exact secret subset on all configs with 100% accuracy, matching Fourier
Noise robustness confirmed: 5% label noise does not degrade MDL -- the gap between true subset DL (~150 bits) and wrong subset DL (~499 bits) is enormous
Zero-entropy signal is unambiguous: the true subset compresses to exactly 0 bits on clean data, while even the next-best wrong subset sits at ~490+ bits
MDL is conceptually more general: unlike Fourier correlation which exploits the multiplicative structure of parity, MDL only requires that the true hypothesis produces a low-entropy residual -- it would work for any deterministic labeling function

What didn't work¶

MDL is ~30% slower than Fourier on the same data: MDL computes residual_rate + binary_entropy per subset, while Fourier computes a single dot product + abs. The entropy computation (with log2) is more expensive than a mean
No practical advantage over Fourier for parity specifically: both achieve 100% accuracy on all configs. MDL's generality does not help when the labeling function is known to be parity

Key insight¶

MDL and Fourier are equivalent for sparse parity in the noiseless case: both produce a binary decision (true subset scores 0/1.0, wrong subsets score ~499/~0). Under noise, both degrade gracefully. The MDL framework is theoretically more principled (it directly measures compression quality rather than correlation), but for parity the distinction is academic.

Quantitative comparison with other approaches¶

Method	n=20/k=3	n=50/k=3	n=20/k=5
MDL	0.013s, 100%	0.267s, 100%	0.254s, 100%
Fourier	0.012s, 100%	0.209s, 100%	0.187s, 100%
Random search	0.011s, 100%	0.142s, 100%	0.426s, 100%
SGD (baseline)	0.12s, 100%	FAIL (54%)	14 epochs

Open Questions¶

Would MDL outperform Fourier on a non-parity labeling function (e.g., threshold, majority, or XOR-of-subsets)?
Can we speed up MDL by computing residual rates in batch (vectorized over subsets)?
At what noise level does MDL start failing? Theoretically when 500 * H(noise_rate) approaches 500 * H(0.5), i.e. noise > ~40%
Could a two-stage approach work: first screen by MDL, then verify by Fourier?

Files¶

Experiment: src/sparse_parity/experiments/exp_mdl.py
Results: results/exp_mdl/results.json