Detailed Meeting Notes¶
Meeting #1, 19 Jan 26 - Energy-Efficient Training¶
Location: SPC main floor · Full notes · Google Doc
Orientation meeting. Introductions and backgrounds. Concepts introduced:
- Memory cost is the largest energy contributor (Bill Dally talk)
- Local register access ~5 pJ vs HBM access ~640 pJ
- Backprop is like the giraffe's recurrent laryngeal nerve -- works but inefficient
- "Nerd snipe" proposal: train a model on smartphone via WebGPU using minimum joules
- WebGPU exposes memory hierarchy (Registers -> Shared -> Global)
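The per-access costs above suggest a back-of-the-envelope energy model. A minimal sketch: only the register (~5 pJ) and HBM (~640 pJ) figures come from the notes; the shared-memory cost is an illustrative assumption.

```python
# Rough per-access energy costs in picojoules. Register and HBM figures are
# from the talk notes above; the shared-memory figure is an assumed
# intermediate value for illustration only.
ACCESS_ENERGY_PJ = {
    "register": 5,
    "shared": 30,    # assumption, not from the notes
    "hbm": 640,
}

def energy_joules(access_counts):
    """Total energy in joules for a dict of {tier: number_of_accesses}."""
    pj = sum(ACCESS_ENERGY_PJ[tier] * n for tier, n in access_counts.items())
    return pj * 1e-12

# Compare a kernel that re-reads operands from HBM against one that keeps
# most accesses in registers (hypothetical access counts):
naive = energy_joules({"hbm": 3_000_000})
tiled = energy_joules({"hbm": 100_000, "register": 2_900_000})
print(naive, tiled)  # the HBM-heavy version costs far more energy
```

This is why exposing the hierarchy (registers -> shared -> global) matters: the same arithmetic can differ by orders of magnitude in energy depending on where the operands live.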
Takeaway
Yaroslav beat Google in the 2018 DawnBench competition (fastest ImageNet training) not through superior intelligence but through three months of optimizing AWS infrastructure for 10-second restart cycles, versus Google's 10+ minutes.
Meeting #2, 26 Jan 26 - Forward-Forward Algorithm¶
Location: Accel board room · Full notes · Google Doc
Discussion of Hinton's Forward-Forward paper. See also: Exp E - Forward-Forward findings.
- Two forward passes (positive/negative) replace forward+backward
- Greedy layer-wise learning: each layer has its own objective
- Goodness = sum of squared ReLU activations
- Negative data generation is the hard problem for complex domains
- Jamie Simon shared implementation results
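The bullets above can be sketched in code. A minimal, hedged sketch of one greedily trained layer: goodness is the sum of squared ReLU activations as in the paper, and positive/negative data are pushed above/below a threshold theta via a logistic objective; details such as input normalization between layers are omitted, and this is not Hinton's or Jamie's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def goodness(h):
    # Goodness = sum of squared (post-ReLU) activations, per example.
    return (h ** 2).sum(axis=1)

class FFLayer:
    """One greedily trained Forward-Forward layer (illustrative sketch)."""

    def __init__(self, d_in, d_out, lr=0.03, theta=2.0):
        self.W = rng.normal(0, 1 / np.sqrt(d_in), (d_in, d_out))
        self.lr, self.theta = lr, theta

    def forward(self, x):
        return np.maximum(x @ self.W, 0.0)  # ReLU

    def train_step(self, x_pos, x_neg):
        # Two forward passes replace forward+backward: push goodness above
        # theta on positive data (sign=+1), below theta on negative (sign=-1).
        for x, sign in ((x_pos, +1.0), (x_neg, -1.0)):
            h = self.forward(x)
            p = 1.0 / (1.0 + np.exp(-sign * (goodness(h) - self.theta)))
            # Gradient of -log(p) w.r.t. W, local to this layer only.
            grad_h = (-(1.0 - p) * sign)[:, None] * 2.0 * h
            grad_h[h <= 0] = 0.0  # ReLU mask
            self.W -= self.lr * x.T @ grad_h / len(x)
```

Because each layer has its own objective, layers can be trained one at a time on the (normalized) outputs of the previous layer, with no backward pass through the stack.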
Meeting #3, 02 Feb 26 - Measuring Joules¶
Location: SPC
Tooling session.
- Barak demonstrated Modal workflow
- Yaroslav demonstrated Colab workflow
- Joules-measuring notebook
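The core of any joules-measuring workflow is integrating sampled power over time. A minimal sketch, assuming power readings come from something like `nvidia-smi --query-gpu=power.draw --format=csv,noheader,nounits` polled during training (the notebook's actual method is not specified in these notes):

```python
def joules_from_samples(times_s, watts):
    """Integrate (timestamp, power) samples into energy via the trapezoid rule.

    times_s: sample timestamps in seconds (increasing)
    watts:   power readings in watts at those timestamps
    """
    assert len(times_s) == len(watts) >= 2
    return sum(
        0.5 * (watts[i] + watts[i + 1]) * (times_s[i + 1] - times_s[i])
        for i in range(len(times_s) - 1)
    )

# A GPU holding a steady ~100 W for 2 s costs ~200 J:
print(joules_from_samples([0.0, 1.0, 2.0], [100.0, 100.0, 100.0]))  # 200.0
```

Sampling rate matters: bursty kernels between slow samples are invisible to this estimator, so the polling interval should be short relative to the workload's phases.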
Meeting #4, 09 Feb 26 - From Beauty to Joules¶
Location: Palmer Square
Presentation: From_Beauty_to_Joules.pdf
Meeting #5, 16 Feb 26 - Intelligence Per Joule¶
Presentation: Intelligence_Per_Joule.pdf
Karpathy Names Task introduced:
- Take 1000 random names from makemore/names.txt
- Predict last 3 characters of 1000 test names
- Record baseline accuracy and total operation count, then optimize
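The task setup can be sketched as follows. This is a hedged illustration: the real task draws 1000 random names from makemore's names.txt, replaced here by a toy list, and the majority-trigram baseline is a strawman for illustration, not the group's baseline.

```python
from collections import Counter

def split_task(names):
    """Each name -> (prefix, last-3-characters target); names under 4 chars skipped."""
    return [(n[:-3], n[-3:]) for n in names if len(n) >= 4]

def majority_baseline(train_pairs, test_pairs):
    """Predict the most common training trigram for every test name."""
    guess = Counter(t for _, t in train_pairs).most_common(1)[0][0]
    correct = sum(t == guess for _, t in test_pairs)
    return correct / len(test_pairs)

# Toy stand-in for 1000 names from makemore's names.txt:
names = ["olivia", "sophia", "amelia", "emma", "mia", "luna", "aria"]
pairs = split_task(names)
acc = majority_baseline(pairs[:4], pairs[4:])
```

From here the exercise is to beat the baseline's accuracy while counting every operation the model performs, so accuracy per operation (and ultimately per joule) can be optimized.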
Meeting #6, 23 Feb 26 - Presentations¶
- Germain: presentation video — truncated backprop, 19% energy reduction, 27% intelligence-per-joule improvement
- Emmett: Pure-Python GPT, reduced memory 80MB -> 35MB with Aster (local · Google Doc)
- Yaroslav presented pebbling games, energy hierarchy, "drosophila of learning" concept
- Key outcome: 3-minute MicroGPT iteration too slow — need sub-1-second task
Meeting #7, 02 Mar 26 - Sparse Parity¶
- Yaroslav presented Technical Sprint 1 results — 2.5hr sprint, ARD metric, gradient fusion (16% cache reuse improvement)
- Andy attempted better chat tooling (codeberg)
- Michael showed Pebbling Game implementation
- Homework assigned: Challenge #1: Sparse Parity ("Drosophila of Learning")
See also: Research overview for all experiment results building on this challenge.
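A minimal generator for the challenge, assuming the standard k-sparse parity formulation (the label is the XOR of a hidden size-k subset of the input bits); the group's exact spec may differ.

```python
import random

def make_sparse_parity(n_bits=20, k=3, n_samples=1000, seed=0):
    """k-sparse parity dataset: label = XOR of a hidden size-k subset of bits."""
    rng = random.Random(seed)
    subset = rng.sample(range(n_bits), k)   # hidden relevant coordinates
    xs, ys = [], []
    for _ in range(n_samples):
        x = [rng.randint(0, 1) for _ in range(n_bits)]
        y = 0
        for i in subset:
            y ^= x[i]                       # parity over the hidden subset
        xs.append(x)
        ys.append(y)
    return xs, ys, subset

xs, ys, subset = make_sparse_parity()
```

The appeal as a "drosophila of learning" is that generation is trivial and evaluation is instant, so a full train/test iteration easily fits the sub-1-second budget identified in Meeting #6.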
Meeting #8, 09 Mar 26 - Demos and Roadmap¶
Full notes · AI notes · Google Doc
- Yad: Demoed the Claude Code agentic harness (video, survey, github). Harness found 1000x faster solution via GF(2). Yaroslav verified correctness and visualized the top algorithm.
- Yaroslav: Presented Knowledge Sprint #2 on energy metrics and the bigger picture roadmap (3-axis cube: process, metric, problem).
- Michael: Showed his Claude approach which preferred 90s-era methods.
- Germain: Demoed supervisor/researcher harness; solutions preferred 2010s methods.
- Uliana: Gave temperature suggestions for Germain's experiments.
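The GF(2) speedup mentioned above has a simple basis: a sparse parity label is a *linear* function of the input bits mod 2, so the hidden mask can be recovered by Gaussian elimination over GF(2) instead of gradient descent. A hedged sketch of that idea (not claimed to be the harness's exact algorithm), with rows packed as integer bitmasks:

```python
def parity(x):
    """Parity (XOR of all bits) of an integer bitmask."""
    return bin(x).count("1") & 1

def solve_gf2(rows, ys, n_bits):
    """Solve X w = y over GF(2). rows are input bitmasks, ys are 0/1 labels.
    Returns the weight bitmask w, or None if the system is not solvable."""
    mask = (1 << n_bits) - 1
    aug = [r | (y << n_bits) for r, y in zip(rows, ys)]  # label stored as bit n_bits
    pivots = {}                                          # column -> pivot row
    for row in aug:
        for col in range(n_bits):
            if (row >> col) & 1:
                if col in pivots:
                    row ^= pivots[col]                   # eliminate this column
                else:
                    pivots[col] = row                    # new pivot
                    break
        if row & mask == 0 and row >> n_bits:
            return None                                  # inconsistent: 0 = 1
    if len(pivots) < n_bits:
        return None                                      # underdetermined
    w = 0
    for col in sorted(pivots, reverse=True):             # back-substitution
        row = pivots[col]
        val = (row >> n_bits) & 1
        for c2 in range(col + 1, n_bits):
            if (row >> c2) & 1:
                val ^= (w >> c2) & 1
        if val:
            w |= 1 << col
    return w

# Recover a hidden 3-bit parity mask from labeled examples:
w_true = (1 << 1) | (1 << 4) | (1 << 7)
rows = [1 << i for i in range(10)] + [0b1111111111, 0b101]
ys = [parity(r & w_true) for r in rows]
print(solve_gf2(rows, ys, 10) == w_true)  # True
```

Elimination costs roughly O(m·n) word operations with bit-packed rows, versus many epochs of dense matrix multiplies for a neural network, which is the kind of gap that produces a 1000x result on this task.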
Homework for next Monday: Get agents to improve Challenge #1 using ARD as the energy proxy. Present results, process, and learnings.