Experiment exp_fourier: Fourier/Walsh-Hadamard Solver for Sparse Parity
Date: 2026-03-04
Status: SUCCESS
Answers: Can we solve sparse parity without neural nets? (Blank-slate approach.)
Hypothesis
If we compute Walsh-Hadamard correlations for every k-subset, the true secret will have correlation ~1.0 while all others are ~0, giving an exact solver in O(C(n,k) * n_samples) time with no training.
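The hypothesis can be sketched directly. The following is a minimal illustration, not the repo's `exp_fourier.py`; function and variable names are my own, and it assumes inputs and labels are ±1-encoded:

```python
# Minimal sketch of the exhaustive Walsh-Hadamard correlation search.
# Assumes X has +/-1 entries and y is the +/-1 parity label.
from itertools import combinations
import numpy as np

def fourier_solve(X, y, k):
    """Return the k-subset with the largest empirical correlation.

    X: (n_samples, n_bits) array of +/-1 inputs
    y: (n_samples,) array of +/-1 labels
    k: sparsity of the secret parity
    """
    n_bits = X.shape[1]
    best_corr, best_subset = -1.0, None
    for subset in combinations(range(n_bits), k):
        chi = X[:, subset].prod(axis=1)  # parity character chi_S(x)
        corr = np.mean(chi * y)          # empirical Fourier coefficient
        if abs(corr) > best_corr:
            best_corr, best_subset = abs(corr), subset
    return best_subset, best_corr
```

For noiseless parity the true subset has empirical correlation exactly 1.0, while every other subset concentrates near 0, which is why a single pass over C(n,k) subsets suffices.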
Config
| Parameter | Value |
|---|---|
| n_bits | 20, 50, 100, 200 |
| k_sparse | 3, 5, 7 |
| hidden | N/A (no neural net) |
| lr | N/A |
| wd | N/A |
| batch_size | N/A |
| max_epochs | N/A |
| n_train | 500 (20 sufficient for k=3) |
| seed | 42, 43, 44 |
| method | fourier (Walsh-Hadamard exhaustive search) |
Results
| Metric | Value |
|---|---|
| Best test accuracy | 100% (all configs) |
| Epochs to >90% | N/A (single pass) |
| Wall time (n=20, k=3) | 0.009s |
| Wall time (n=50, k=3) | 0.16s |
| Wall time (n=20, k=5) | 0.14s |
| Weighted ARD (n=20, k=3) | 1,147,375 |
| ARD improvement vs baseline | N/A (different algorithm class) |
Summary Table
| Config | Method | Acc | Time | Subsets | Samples |
|---|---|---|---|---|---|
| n=20, k=3 | Fourier | 100% | 0.009s | 1,140 | 500 |
| n=20, k=3 | SGD (fast.py) | 100% | 0.12s | — | 1000 |
| n=50, k=3 | Fourier | 100% | 0.16s | 19,600 | 500 |
| n=50, k=3 | SGD direct | 54% (FAIL) | — | — | 200 |
| n=50, k=3 | Curriculum | >90% | — | — | — |
| n=20, k=5 | Fourier | 100% | 0.14s | 15,504 | 500 |
| n=20, k=5 | SGD (n=5000) | 100% | — | — | 5000 |
| n=100, k=3 | Fourier | 100% | 1.3s | 161,700 | 500 |
| n=200, k=3 | Fourier | 100% | 10.8s | 1,313,400 | 500 |
| n=20, k=7 | Fourier | 100% | 0.7s | 77,520 | 500 |
Analysis
What worked
- Perfect accuracy on every config tested. Correlation for the true subset is always 1.0.
- 13x faster than SGD on n=20,k=3 (0.009s vs 0.12s).
- Solves n=50,k=3 trivially where SGD fails (54%) and curriculum needs 20 epochs.
- Solves n=20,k=7 (77,520 subsets) in 0.7s.
- Only needs ~20 samples for k=3 (vs 500-5000 for SGD). Sample complexity is O(1/epsilon^2), independent of n.
- Scales to n=200,k=3 in 10.8s (1.3M subsets).
What didn't work
- ARD is terrible: 1,147,375 weighted ARD vs 17,976 for SGD. The algorithm re-reads the entire dataset for every candidate subset, so memory access is pure streaming with no locality.
- Combinatorial wall at high k: C(n,k) grows fast. C(50,5) = 2.1M, C(100,5) = 75M, C(50,7) = 99M. The method is O(C(n,k)), polynomial in n for fixed k but explodes with k.
- Not a learning algorithm. It is a brute-force search that does not generalize or transfer.
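The combinatorial wall can be checked exactly with `math.comb` (the subset counts quoted above follow directly):

```python
# Exact subset counts C(n, k) for the configs discussed above.
# Fixed k stays polynomial in n; growing k explodes.
from math import comb

for n, k in [(20, 3), (50, 3), (200, 3), (20, 7), (50, 5), (100, 5), (50, 7)]:
    print(f"C({n},{k}) = {comb(n, k):,}")
```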
Surprise
- The Fourier solver is 13x faster than the best SGD on n=20,k=3. A trivial brute-force beats the optimized neural net approach.
- Only 20 samples are needed for reliable k=3 detection. SGD uses 500-5000.
Open Questions (for next experiment)
- Can we prune the subset search with single-variable correlations first? (Check each bit's individual correlation, discard low ones, then only check k-subsets of the top candidates.)
- Hybrid: use Fourier to find the secret, then a trivial classifier (just compute the product). What is the total energy cost vs SGD?
- At what (n,k) does SGD actually beat Fourier in wall time?
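The hybrid idea in the second question is cheap to prototype: once the Fourier search has identified the secret subset, inference reduces to a product over k bits, with no forward pass at all. A hedged sketch (names are illustrative, assuming ±1 encoding):

```python
# Sketch of the "trivial classifier" from the hybrid open question:
# given the recovered secret subset, prediction is an O(k) product.
import numpy as np

def make_parity_classifier(secret):
    """Return a +/-1 predictor for the recovered subset of bit indices."""
    idx = np.asarray(secret)
    def predict(X):
        return X[:, idx].prod(axis=1)  # exact parity over the secret bits
    return predict
```

Under this split, the total energy cost is one Fourier search (amortized over all future queries) plus O(k) multiplies per prediction, which frames the comparison against SGD's training plus per-query forward passes.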
Files
- Experiment:
src/sparse_parity/experiments/exp_fourier.py
- Results:
results/exp_fourier/results.json