Experiment: GF(2) with Noisy Labels¶
Hypothesis: GF(2) Gaussian elimination assumes exact parity. With label noise, the linear system becomes inconsistent. We test at what noise rate it fails and whether robust methods can recover.
Date: 2026-03-09 Status: SUCCESS (robust solver) / FAILED (basic solver)
Summary¶
| Method | Noise Tolerance | Notes |
|---|---|---|
| Basic GF(2) | 0% (1% causes failure) | Inconsistent system, no solution |
| Robust (subset sampling) | Up to 10-15% | Finds clean subset of equations |
| Robust at 15% noise | 65% accuracy | Degraded but usable |
| Robust at 20% noise | 20% accuracy | Too much noise |
Key finding: GF(2) is extremely fragile to noise, but subset-sampling recovers the secret at moderate noise levels by finding clean equations.
Config¶
| Parameter | Value |
|---|---|
| n_bits | 20 |
| k_sparse | 3 |
| n_samples | 21-500 (varied) |
| noise_rate | 0-30% (varied) |
| seeds | 10-50 per config |
| max_subsets | 100 (robust solver) |
Results¶
Experiment 1: Basic GF(2) Noise Sweep¶
At n_samples=100, basic GF(2) with Gaussian elimination:
| Noise Rate | Correct | Inconsistent |
|---|---|---|
| 0% | 20/20 | 0 |
| 1% | 0/20 | 20 |
| 2%+ | 0/20 | 20 |
Even 1 corrupted label out of 100 causes complete failure. The linear system over GF(2) has no solution when any equation is wrong.
Experiment 2: More Samples Don't Help Basic Solver¶
| n_samples | Noise 1% | Noise 2% | Noise 5% |
|---|---|---|---|
| 21 | 9/10 | 9/10 | 4/10 INC |
| 30 | 10/10 | 10/10 | 0/10 INC |
| 50 | 10/10 | 0/10 INC | 0/10 INC |
| 100 | 0/10 INC | 0/10 INC | 0/10 INC |
| 500 | 0/10 INC | 0/10 INC | 0/10 INC |
Counter-intuitively, more samples make things worse for the basic solver. With 21 samples and 1% noise (0.21 expected flips), you might get lucky and have no flips. With 100 samples and 1% noise (1 expected flip), you're guaranteed inconsistency.
Experiment 3: Robust Subset-Sampling Solver¶
The robust solver samples multiple subsets of n equations and votes on the most common solution:
| Noise | Basic GF(2) | Robust | Improvement |
|---|---|---|---|
| 0% | 20/20 | 20/20 | +0 |
| 2% | 0/20 | 20/20 | +20 |
| 5% | 0/20 | 20/20 | +20 |
| 10% | 0/20 | 20/20 | +20 |
| 15% | 0/20 | 13/20 | +13 |
| 20% | 0/20 | 4/20 | +4 |
The robust solver works because: With 100 samples and 10% noise, ~90 equations are clean. If we sample subsets of 20, we'll find at least one clean subset with high probability. The true solution appears in a majority of consistent subsets.
Algorithm: Robust GF(2) Solver¶
def gf2_solve_robust(x, y, n_bits, max_subsets=100):
"""
For noisy data, sample multiple subsets of n equations and vote.
"""
A = ((x + 1) / 2).astype(np.uint8) # Convert to GF(2)
b = ((y + 1) / 2).astype(np.uint8)
solution_counts = {}
for indices in random_subsets(n_samples, n_bits, max_subsets):
solution, rank = gf2_gauss_elim(A[indices], b[indices])
if solution is not None: # consistent
key = tuple(solution.tolist())
solution_counts[key] = solution_counts.get(key, 0) + 1
# Return most common solution
best = max(solution_counts, key=solution_counts.get)
return best
Analysis¶
Why is GF(2) so fragile?¶
GF(2) Gaussian elimination solves A·s = b over the binary field. Each equation is:
A single flipped label makes one equation wrong. In the augmented matrix [A|b], this creates a row where the A part is in the row space, but the b part doesn't match. The consistency check fails.
Why does subset sampling work?¶
With m samples and noise rate p: - Expected clean samples: m·(1-p) - Probability a random n-subset is clean: (1-p)^n
For m=100, p=0.10, n=20: - Clean samples: 90 - P(clean subset) ≈ 0.9^20 ≈ 12%
With 100 subset samples, we expect ~12 clean subsets, all producing the same (correct) solution. Wrong solutions appear in at most a few subsets. Majority voting filters them out.
Comparison with MDL¶
DISCOVERIES.md notes that MDL is noise-robust at 5% noise. Comparison:
| Method | 5% noise | 10% noise | ARD | Time |
|---|---|---|---|---|
| MDL | 100% | untested | 1,147,375 | 0.013s |
| GF(2) basic | 0% | 0% | ~500 | 0.0005s |
| GF(2) robust | 100% | 100% | ~500 | ~0.01s |
GF(2) robust is comparable to MDL at 5% noise, faster at 0.01s vs 0.013s, and has 2000x better ARD (~500 vs 1,147,375).
Open Questions¶
-
What's the theoretical noise threshold? The robust solver works up to ~15% empirically. Is there a closed-form bound?
-
Optimal subset count? We used 100 subsets. Fewer might work, reducing compute.
-
Comparison with error-correcting codes? LDPC or BCH codes are designed for exactly this problem. Would they be faster?
-
Higher k? With k=5 or k=7, the secret is larger (more bits in the solution). Does noise tolerance scale?
Files¶
- Experiment:
src/sparse_parity/experiments/exp_gf2_noise.py - Results:
results/exp_gf2_noise/results.json - This document:
findings/exp_gf2_noise.md