Compute
Maximize useful inference work per watt of electrical power.
Human baseline
~20 W brain-equivalent inference
Competition goal
Maximum useful work per watt (tokens/joule, FLOPS/watt)
The Compute track challenges teams to build or benchmark AI systems that deliver brain-equivalent useful work within a human-scale power budget of roughly 20 W. Relevant measures include LLM inference density, tokens per joule, and FLOPS per watt, each evaluated on tasks with documented human performance baselines.
Example metrics
- Tokens generated per joule on a fixed benchmark task
- Useful inference throughput at ≤20 W system power
- Energy per correct answer on reasoning benchmarks
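As a rough illustration of how these metrics relate to one another, the sketch below derives each of them from a single measured benchmark run. The `RunResult` record, its field names, and the sample numbers are hypothetical and chosen only for the example; the track does not prescribe a measurement format, and the 20 W default simply mirrors the power budget named above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunResult:
    """One benchmark run. All field names are hypothetical, for illustration."""
    tokens_generated: int
    correct_answers: int
    energy_joules: float      # total system energy over the run
    duration_seconds: float   # wall-clock length of the run


def tokens_per_joule(run: RunResult) -> float:
    """Tokens generated per joule of system energy."""
    return run.tokens_generated / run.energy_joules


def average_power_watts(run: RunResult) -> float:
    """Mean system power: energy divided by elapsed time."""
    return run.energy_joules / run.duration_seconds


def energy_per_correct_answer(run: RunResult) -> float:
    """Joules spent per correct answer; infinite if nothing was correct."""
    if run.correct_answers == 0:
        return float("inf")
    return run.energy_joules / run.correct_answers


def throughput_within_budget(run: RunResult,
                             budget_watts: float = 20.0) -> Optional[float]:
    """Tokens per second, or None if mean power exceeds the budget."""
    if average_power_watts(run) > budget_watts:
        return None
    return run.tokens_generated / run.duration_seconds


# Made-up numbers: 12,000 tokens and 40 correct answers in 10 minutes,
# using 9,000 J in total.
run = RunResult(tokens_generated=12_000, correct_answers=40,
                energy_joules=9_000.0, duration_seconds=600.0)

print(tokens_per_joule(run))           # about 1.33 tokens/J
print(average_power_watts(run))        # 15.0 W, under the 20 W budget
print(energy_per_correct_answer(run))  # 225.0 J per correct answer
print(throughput_within_budget(run))   # 20.0 tokens/s
```

Using mean power for the budget check is a simplification; a stricter reading of the ≤20 W limit would check peak power instead.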
Submissions must define 'useful work' with a reproducible benchmark and compare the energy their system consumes on that benchmark against documented human cognitive performance on the same task class.
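One possible way to frame the human comparison, sketched below, is to convert a documented human result into energy per correct answer using the track's ~20 W baseline, then take the ratio against the system's figure. This is an assumption about how the comparison could be set up, not a rule of the track; the function names and the sample values (60 s per task, 80% accuracy, 225 J per correct answer) are hypothetical.

```python
HUMAN_POWER_WATTS = 20.0  # the track's stated human baseline (~20 W)


def human_energy_per_correct(seconds_per_task: float, accuracy: float) -> float:
    """Estimated joules a human spends per correct answer.

    Assumes constant power draw at the baseline and that every attempt,
    correct or not, costs the same time.
    """
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy must be in (0, 1]")
    return HUMAN_POWER_WATTS * seconds_per_task / accuracy


def energy_ratio(system_joules_per_correct: float,
                 human_joules_per_correct: float) -> float:
    """System energy relative to human energy; below 1.0 favors the system."""
    return system_joules_per_correct / human_joules_per_correct


# Hypothetical inputs, for illustration only.
human = human_energy_per_correct(seconds_per_task=60.0, accuracy=0.8)
ratio = energy_ratio(system_joules_per_correct=225.0,
                     human_joules_per_correct=human)
print(f"human: {human:.0f} J per correct answer, ratio: {ratio:.2f}")
```

Dividing by accuracy charges the energy of failed attempts to the successful ones, which keeps the human and system figures on the same "per correct answer" footing.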