
SutroYaro

Research workspace for the Sutro Group, a study group exploring energy-efficient AI training. Weekly meetings at South Park Commons, San Francisco.

What this site is for

SutroYaro is the lab's memory and dispatcher — not where the cutting-edge research happens, but where you go to find out what's been tried, who's working on what right now, and how to spin up new builds against the org's other repos. Five things it does:

  1. Lab memory: DISCOVERIES.md is the curated "what's proven" record.
  2. Agent-teams dispatcher — the hinton-problems and schmidhuber-problems catalogues (111 stubs combined) were built from SutroYaro sessions; the reusable machinery lives here.
  3. Cross-repo index — see Related Repos for the full 8-repo map.
  4. Public face — this site, shown at the Monday meetings.
  5. Telegram + Google Docs + GitHub sync — see the Sync Runbook.

The active research front itself has migrated to cybertronai/ByteDMD/experiments/grid. Benchmark problems live in their own per-purpose repos. Cost metrics are defined in ByteDMD and simplified-dally-model. SutroYaro is the integration layer that keeps everything visible and connected.

The Challenges

The group runs several energy-efficient-learning challenges in parallel, each in its own repo:

| Challenge | Goal | Repo |
|---|---|---|
| Sparse Parity | Learn k-bit XOR for minimum data movement | sparse-parity-challenge |
| Energy-efficient matmul | Minimum-energy 16x16 matmul on a 2D grid | sutro-problems |
| Sparse parity on the grid | Solve sparse parity in ~9 grid instructions | sutro-problems |
| wikitext | Train a WikiText-103 LM for minimum Joules | cybertronai/wikitext |

See the Challenges index for current records, who is active, open questions, and where to start on each.
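
For readers new to the sparse parity task, here is a minimal sketch of what "learn k-bit XOR" means as a dataset. The function name `make_sparse_parity`, the 0/1 encoding, and the seed handling are assumptions made for illustration; the challenge repo may generate and encode its data differently. The parameter names mirror the `--n_bits` and `--k_sparse` flags used in the Quick Start.

```python
import numpy as np

def make_sparse_parity(n_samples, n_bits=20, k_sparse=3, seed=0):
    """Hypothetical generator: the label is the XOR of k secret input bits."""
    rng = np.random.default_rng(seed)
    # Pick k distinct secret positions out of n_bits.
    secret = np.sort(rng.choice(n_bits, size=k_sparse, replace=False))
    # Uniform random 0/1 inputs.
    X = rng.integers(0, 2, size=(n_samples, n_bits), dtype=np.uint8)
    # XOR of the secret columns is their sum modulo 2.
    y = (X[:, secret].sum(axis=1) % 2).astype(np.uint8)
    return X, y, secret

X, y, secret = make_sparse_parity(8, n_bits=20, k_sparse=3)
print("secret bits:", secret)
print("labels:", y)
```

A solver sees only `X` and `y` and has to identify `secret`; the challenge is to do so while moving as little data as possible.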

Where the Active Research Lives

This repo is the lab notebook for the group's Phase 1 and Phase 2 work. The live research front has migrated to cybertronai/ByteDMD/experiments/grid — Yaroslav's collection of self-contained experiments built around the new ByteDMD metric. If you want the current edge of the work, start there. SutroYaro stays the home for accumulated findings, the autonomous-research lab, the public site, and the sparse parity challenge.

What We Found

36 experiments across three phases, plus GPU energy validation. The full ranked results are in the Practitioner's Field Guide.

Phase 1 (16 experiments): Started with a broken SGD baseline (LR=0.5, stuck at 54%). Fixed hyperparameters to solve it in 0.12s. Optimized data movement within the SGD framework, hitting a ceiling because one tensor (W1) dominates 75% of all float reads. Pivoted to new algorithms.

Phase 2 (17 experiments): Tested algebraic, information-theoretic, local learning, hardware-aware, and alternative approaches in parallel:

| Method | Time (n=20/k=3) | DMD | Why it works |
|---|---|---|---|
| SMT Backtracking | 0.002s | 19,532 | Constraint satisfaction with k-1 pruning. |
| Kushilevitz-Mansour | 0.001s | 27,165 | Flip each bit, measure label change. |
| GF(2) Gaussian Elimination | 0.009s | 153,745 | Parity is linear over the binary field. |
| SGD (baseline) | 0.089s | est. | The neural network solves it, just slower. |
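
The GF(2) row rests on the fact that a parity label is a linear function of the input bits modulo 2, so the secret subset is the solution of a linear system. The sketch below is a plain Gauss-Jordan elimination over GF(2), not the repo's implementation; `solve_gf2` and the sample count of 64 are assumptions chosen for illustration.

```python
import numpy as np

def solve_gf2(X, y):
    """Solve X @ s = y (mod 2) by Gauss-Jordan elimination; free variables are set to 0."""
    A = np.column_stack([X, y]).astype(np.uint8) & 1   # augmented matrix [X | y]
    n_rows, n_cols = A.shape[0], A.shape[1] - 1
    pivots, row = [], 0
    for col in range(n_cols):
        # Look for a row at or below `row` with a 1 in this column.
        hits = np.flatnonzero(A[row:, col])
        if hits.size == 0:
            continue                                   # no pivot: free variable
        p = row + hits[0]
        A[[row, p]] = A[[p, row]]                      # swap the pivot row into place
        # XOR the pivot row into every other row that has a 1 in this column.
        mask = A[:, col].astype(bool)
        mask[row] = False
        A[mask] ^= A[row]
        pivots.append(col)
        row += 1
        if row == n_rows:
            break
    # A row reading 0 = 1 means the labels are not a parity of the inputs.
    if np.any(~A[:, :n_cols].any(axis=1) & (A[:, -1] == 1)):
        raise ValueError("system is inconsistent over GF(2)")
    s = np.zeros(n_cols, dtype=np.uint8)
    for r, col in enumerate(pivots):
        s[col] = A[r, -1]
    return s

rng = np.random.default_rng(0)
n_bits, k_sparse, n_samples = 20, 3, 64
secret = np.sort(rng.choice(n_bits, size=k_sparse, replace=False))
X = rng.integers(0, 2, size=(n_samples, n_bits), dtype=np.uint8)
y = np.bitwise_xor.reduce(X[:, secret], axis=1)

s = solve_gf2(X, y)
recovered = np.flatnonzero(s)
print("secret:", secret, "recovered:", recovered)
```

With noiseless labels and a few more samples than bits, the system almost surely has full column rank, so the only solution is the indicator vector of the secret bits.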

DMD = Data Movement Distance (Ding et al.), measured via TrackedArray. Lower is better.
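
To make the metric concrete, here is a rough sketch of a DMD-style cost. It assumes that each access is charged the square root of its LRU stack depth (its reuse distance). The function `dmd`, the address trace format, and the pricing of first-time accesses are illustrative assumptions; the repo's TrackedArray may count differently.

```python
import math

def dmd(trace):
    """Sum of sqrt(LRU stack depth) over a sequence of hashable addresses."""
    stack = []      # least recently used first, most recently used last
    total = 0.0
    for addr in trace:
        if addr in stack:
            # Depth 1 means the address was the most recently used one.
            depth = len(stack) - stack.index(addr)
            stack.remove(addr)
        else:
            # Assumption: a first-time access is priced just below the current stack.
            depth = len(stack) + 1
        total += math.sqrt(depth)
        stack.append(addr)   # the accessed address becomes most recently used
    return total

interleaved = dmd(["a", "b", "a", "b"])
grouped = dmd(["a", "a", "b", "b"])
print(f"interleaved={interleaved:.3f} grouped={grouped:.3f}")
```

The two traces touch the same addresses the same number of times, but the grouped order reuses each address while it is still at the top of the stack, so it scores lower. That is the sense in which the metric rewards locality.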

All four local learning rules (Hebbian, Predictive Coding, Equilibrium Propagation, Target Propagation) failed at chance level. Parity requires k-th order interaction detection, which local statistics cannot provide.
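
The claim about k-th order interactions can be checked directly on a small instance. The sketch below enumerates every input for a toy problem (6 bits, 3 secret bits; both values and the secret positions are arbitrary choices for illustration) and shows that single-bit and pairwise statistics carry no signal about the label, while the full 3-way product does. It does not reproduce the four local learning rules themselves.

```python
import itertools
import numpy as np

n_bits, secret = 6, [0, 2, 5]
X = np.array(list(itertools.product([0, 1], repeat=n_bits)))   # all 64 inputs
y = X[:, secret].sum(axis=1) % 2                               # parity of the secret bits

# Map {0, 1} -> {+1, -1} so that a correlation is the mean of a product.
Xs = 1 - 2 * X
ys = 1 - 2 * y

# First order: correlation of each single bit with the label.
single = (Xs * ys[:, None]).mean(axis=0)

# Second order: correlation of each pair's product with the label.
pair = {
    p: float((Xs[:, list(p)].prod(axis=1) * ys).mean())
    for p in itertools.combinations(range(n_bits), 2)
}

# Third order: correlation of the secret triple's product with the label.
triple = float((Xs[:, secret].prod(axis=1) * ys).mean())

print("max |single|:", np.abs(single).max())
print("max |pair|:  ", max(abs(v) for v in pair.values()))
print("secret triple:", triple)
```

Under uniform inputs, every statistic of order below k is exactly zero, even for subsets of the secret bits, so a rule that only accumulates low-order local correlations has nothing to climb.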

GPU measurement: Ran methods on NVIDIA L4 via Modal Labs using PyTorch CUDA (5 runs). GPU is 4-790x slower than CPU at this problem size. Sparse parity tensors are too small for CUDA to help. See GPU vs CPU findings.

Quick Start

git clone https://github.com/cybertronai/SutroYaro.git
cd SutroYaro

# Option A: Nix (recommended)
nix develop

# Option B: pip fallback
export PYTHONPATH=$PWD/src:$PYTHONPATH && pip install numpy

# Verify all 14 experiments across 3 challenges in <1 second
python3 bin/reproduce-all

# Run sparse parity with GF(2) (509 microseconds)
python3 src/harness.py --method gf2 --n_bits 20 --k_sparse 3

# Run sparse sum (new challenge)
python3 src/harness.py --challenge sparse-sum --method sgd

# Measure real GPU energy via Modal Labs
pip install modal && modal token set
modal run bin/gpu_energy.py

# Run autonomous agent loop
bin/run-agent --tool claude --max 10

Full setup instructions in CONTRIBUTING.md.

Where to Find Things

| What | Where |
|---|---|
| New here? Start here | What's New (March 2026) |
| All 36 experiments ranked | Practitioner's Field Guide |
| Add a new challenge | Adding a Challenge |
| GPU vs CPU findings | GPU vs CPU for Sparse Parity |
| Run experiments with any AI tool | Agent CLI Guide |
| Scripts and toolkit | Tooling |
| Full protocol design | Peer Research Protocol |
| What's been proven so far | DISCOVERIES.md |
| Individual experiment findings | Research > Findings |
| Meeting notes and Google Docs | Meetings |
| Auto DMD tracking | Auto-instrumented DMD Tracking |
| How to contribute | CONTRIBUTING.md |
| Submit a solution | Sparse Parity Challenge |

| Resource | Link |
|---|---|
| Telegram | t.me/sutro_group |
| Code repo | cybertronai/sutro |
| Challenge | sparse-parity-challenge |
| All related repos | docs/related-repos.md (curated map of the cybertronai org repos and how they connect) |
| Active threads (snapshot) | docs/research/active-threads-2026-05-09.md (what every contributor is working on right now) |
| The Bigger Picture | Yaroslav's roadmap |
| Meetings | Mondays 18:00 at South Park Commons (380 Brannan St) |

Companion baseline catalogs (shipped May 2026)

The representational + algorithmic baseline pair for v2 ByteDMD instrumentation. Pure numpy, laptop-runnable, with paper-comparison metrics per stub.

How the cybertronai org repos fit together

graph LR
    ByteDMD["ByteDMD"]
    SimpleDally["simplified-dally-model"]
    SPChall["sparse-parity-challenge"]
    Hinton["hinton-problems"]
    Schmid["schmidhuber-problems"]
    SutroP["sutro-problems"]
    Sutro["sutro"]
    SY["SutroYaro (this repo)"]
    ByteDMD --> SY
    SimpleDally --> SY
    SPChall --> SY
    Hinton --> SY
    Schmid --> SY
    SutroP --> SY
    Sutro --> SY
    Sutro -.-> SPChall
    classDef cost fill:#e8d5ff,stroke:#7e3ff2,stroke-width:2px,color:#000
    classDef problem fill:#cfe5ff,stroke:#1f6feb,stroke-width:2px,color:#000
    classDef lab fill:#d1f4d1,stroke:#1a7f37,stroke-width:3px,color:#000
    class ByteDMD,SimpleDally cost
    class SPChall,Hinton,Schmid,SutroP,Sutro problem
    class SY lab

Purple = cost-metric repos · Blue = problem repos · Green = the lab notebook (this repo). Full curated map with descriptions: docs/related-repos.md.