From Kepler to Newton: Inductive Biases Guide Learned World Models in Transformers
Ziming Liu, Sophia Sanborn, Surya Ganguli, Andreas Tolias

TL;DR
This paper shows that simple inductive biases let generic Transformers move from mere curve-fitting to learning Newtonian mechanics as a world model.
Contribution
The authors introduce three minimal inductive biases that allow generic Transformers to learn physical laws, bridging the gap between high prediction accuracy and true scientific understanding.
Findings
Transformers with spatial smoothness and stability learn Keplerian orbits.
Adding temporal locality enables discovery of Newtonian force laws.
Architectural biases determine whether models fit curves or understand physics.
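For readers who want the distinction in the findings above spelled out, the two descriptions of planetary motion can be written as follows. These are the standard textbook forms; the notation (semi-major axis a, eccentricity e, true anomaly \theta, gravitational constant G, central mass M) is an assumption made here and is not taken from the paper.

```latex
% Keplerian description: a geometric fit of the orbit's shape
r(\theta) = \frac{a\,(1 - e^{2})}{1 + e\cos\theta}

% Newtonian description: a local force law that generates the orbit
\ddot{\mathbf{r}} = -\,\frac{G M}{\lVert \mathbf{r} \rVert^{3}}\,\mathbf{r}
```

The first equation describes the curve as a whole, while the second relates acceleration to the current position only. A model can reproduce the first without ever representing the second, which is the gap the findings refer to.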
Abstract
Can general-purpose AI architectures go beyond prediction to discover the physical laws governing the universe? True intelligence relies on "world models" -- causal abstractions that allow an agent to not only predict future states but understand the underlying governing dynamics. While previous "AI Physicist" approaches have successfully recovered such laws, they typically rely on strong, domain-specific priors that effectively "bake in" the physics. Conversely, Vafa et al. recently showed that generic Transformers fail to acquire these world models, achieving high predictive accuracy without capturing the underlying physical laws. We bridge this gap by systematically introducing three minimal inductive biases. We show that ensuring spatial smoothness (by formulating prediction as continuous regression) and stability (by training with noisy contexts to mitigate error accumulation)…
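A minimal sketch of how the two biases named in the abstract might look in a data pipeline, under stated assumptions: the next position is a continuous regression target rather than a discrete token, and Gaussian noise is added to the context only. The function names, the leapfrog integrator, the context length, and the noise scale are all hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def simulate_orbit(n_steps=200, dt=0.01, gm=1.0):
    """Integrate a 2D orbit under inverse-square gravity with leapfrog steps."""
    pos = np.array([1.0, 0.0])
    vel = np.array([0.0, 0.8])  # below circular speed, so the orbit is an ellipse
    traj = np.empty((n_steps, 2))
    for i in range(n_steps):
        traj[i] = pos
        acc = -gm * pos / np.linalg.norm(pos) ** 3
        vel_half = vel + 0.5 * dt * acc
        pos = pos + dt * vel_half
        acc_new = -gm * pos / np.linalg.norm(pos) ** 3
        vel = vel_half + 0.5 * dt * acc_new
    return traj

def make_noisy_regression_pairs(traj, context_len=8, noise_std=0.01, seed=0):
    """Build (noisy context, clean next position) pairs for regression.

    Noise is applied to the contexts only; the targets stay clean, so a model
    trained on these pairs sees inputs that resemble its own imperfect rollouts.
    """
    rng = np.random.default_rng(seed)
    n = len(traj) - context_len
    contexts = np.stack([traj[i:i + context_len] for i in range(n)])
    targets = traj[context_len:]  # targets[i] follows contexts[i]
    noisy_contexts = contexts + rng.normal(0.0, noise_std, size=contexts.shape)
    return noisy_contexts, targets

traj = simulate_orbit()
noisy_contexts, targets = make_noisy_regression_pairs(traj)
```

Here `noisy_contexts` has shape `(192, 8, 2)` and `targets` has shape `(192, 2)`. Any regression model with a mean-squared-error loss could be trained on these pairs; the model itself is left out because the abstract as shown here does not specify it.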
Taxonomy
Topics: Machine Learning in Materials Science · Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI)
