Computing the Uncomputable
A visual guide based on the essay by Packy McCormick & Pim De Witte
Have you ever had a dream where you simply watched? That's a video model. Have you ever had a lucid dream where you shaped the story? That's a World Model.
Clap your hands five times.
Now try to describe it. In words. Every detail.
You just did it in 0.3 seconds. Describing it would take pages. Coding it would take months.
Language is an incredibly lossy compression of reality.
Joseph Knecht left Castalia because symbols alone weren't enough.
Castalia: perfect symbolic manipulation
The world outside: messy, embodied, unpredictable
Large Language Models are our Castalians. They can describe clapping, but they cannot clap.
A batter facing a 100 mph fastball has roughly 0.4 seconds before the ball crosses the plate, so they must commit to the swing before their brain has finished processing where the ball is.
They don't react to reality — they react to their internal World Model's prediction.
A Brief History of World Models
What would a World Model do?
Schmidhuber · Sutton
Before we had the compute, the data, or the architecture, we had the dream. Schmidhuber proposed learning a model of the world. Sutton proposed unifying learning, planning, and reacting. Both were decades ahead of their time.
Proof of Concept
Ha & Schmidhuber · SimPLe
Ha and Schmidhuber asked: Can agents learn inside their own dreams? Using VAEs and RNNs, they trained agents entirely in imagination. SimPLe then learned to play Atari games from only about 2 hours of real gameplay per game, across a 26-game benchmark.
Human Performance
DreamerV2 · MuZero · IRIS · JEPA
DreamerV2 reached human-level performance on the 55-game Atari benchmark, with its policy trained entirely in imagination on a single GPU. MuZero took the opposite approach: planning in an abstract latent space without generating any frames. The split between generative and latent World Models was born.
Interactivity
GAIA-1 · DIAMOND · Genie
GAIA-1 scaled to 9 billion parameters on real driving video. DIAMOND used diffusion to produce a fully playable Counter-Strike from 87 hours of footage. Genie learned actions from scratch — no one told it the controls.
The Real World
Comma.ai · V-JEPA 2 · SIMA 2 · General Intuition
Comma.ai deployed a World-Model-trained driving policy in production vehicles. V-JEPA 2 controlled robot arms zero-shot. The dream is becoming reality.
Not everything called a 'World Model' is one. Here's how the pieces fit together.
Who's building what, and how much capital is behind them.
Every approach runs into the same wall: it needs better data.
The optimal path forward is likely somewhere between where Vision-Language-Action models (VLAs) are today and where AMI might be one day.
For millennia, we watched shadows on the wall — describing reality in language, code, and symbols.
Then we learned to dream — building models that imagine what the world could look like.
Then we learned to act inside the dream — taking actions, observing consequences, and training in imagination.
Now the dream is becoming reality. Agents trained in World Models are driving cars, controlling robots, and learning skills no one explicitly taught them.
The real world is — or was — uncomputable.
World Models are changing that.
From Plato's cave to Schmidhuber's dreams, from Ha's car racing game to autonomous vehicles on Tokyo streets — we are learning to build machines that understand our world not through words, but through actions.
Read the full essay on Not Boring →