On the Spatiotemporal Dynamics of Generalization in Neural Networks
Zichao Wei

TL;DR
This paper introduces SEAD, a physics-inspired neural architecture that achieves perfect length generalization on tasks such as parity, addition, and learning a cellular automaton by enforcing locality, symmetry, and stability.
Contribution
The author derives a novel neural cellular automaton architecture from three physical postulates (locality, symmetry, and stability), enabling robust, scale-invariant generalization in neural networks.
Findings
Achieved perfect length generalization on the parity task.
Demonstrated scale-invariant addition from 16 to 1 million digits.
Learned a Turing-complete cellular automaton without divergence.
Abstract
Why do neural networks fail to generalize addition from 16-digit to 32-digit numbers, while a child who learns the rule can apply it to arbitrarily long sequences? We argue that this failure is not an engineering problem but a violation of physical postulates. Drawing inspiration from physics, we identify three constraints that any generalizing system must satisfy: (1) Locality -- information propagates at finite speed; (2) Symmetry -- the laws of computation are invariant across space and time; (3) Stability -- the system converges to discrete attractors that resist noise accumulation. From these postulates, we derive -- rather than design -- the Spatiotemporal Evolution with Attractor Dynamics (SEAD) architecture: a neural cellular automaton where local convolutional rules are iterated until convergence. Experiments on three tasks validate our theory: (1) Parity -- demonstrating…
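The three postulates in the abstract can be illustrated with a small hand-written example. The sketch below is not the SEAD model and contains no learned parameters. It is a fixed local rule for decimal addition, chosen only to show what "a local rule iterated until convergence" can look like. The function names `local_step` and `add_by_iteration` are hypothetical. Each cell reads only itself and its less significant neighbor (locality), every cell applies the same rule at every step (symmetry), and iteration stops at a fixed point where all cells hold a valid digit (stability).

```python
def local_step(cells):
    """One synchronous update of all cells.

    Every cell applies the same rule: keep its own digit (value mod 10)
    and absorb the carry emitted by its right-hand (less significant) neighbor.
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        # Carry arriving from the neighbor at i + 1; zero at the right boundary.
        carry_in = cells[i + 1] // 10 if i + 1 < n else 0
        nxt.append(cells[i] % 10 + carry_in)
    return nxt


def add_by_iteration(a, b):
    """Add two decimal digit strings by iterating local_step to a fixed point.

    Returns (sum_as_string, number_of_updates_before_convergence).
    """
    # One extra leading cell so the final carry has somewhere to land.
    width = max(len(a), len(b)) + 1
    a, b = a.zfill(width), b.zfill(width)

    # Initial state: digit-wise sums, each in 0..18 (not yet valid digits).
    cells = [int(x) + int(y) for x, y in zip(a, b)]

    steps = 0
    while True:
        nxt = local_step(cells)
        if nxt == cells:  # fixed point: every cell is below 10
            break
        cells = nxt
        steps += 1

    digits = "".join(str(c) for c in cells).lstrip("0")
    return (digits or "0"), steps
```

For example, `add_by_iteration("999", "1")` returns `("1000", 3)`: the carry moves one cell per update, so the number of updates grows with the length of the longest carry chain while the rule itself never changes with input length.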
