Composing Global Solutions to Reasoning Tasks via Algebraic Objects in Neural Nets
Yuandong Tian

TL;DR
This paper uncovers algebraic structures in 2-layer neural networks trained on reasoning tasks, enabling analytical construction of global solutions from partial ones, with implications for understanding training dynamics and overparameterization.
Contribution
It introduces CoGS, a framework revealing algebraic structures in neural network solution spaces, allowing explicit construction of global solutions from partial solutions.
Findings
Around 95% of the solutions found by gradient descent match the theoretical constructions
Overparameterization decouples training dynamics and aids in finding solutions
Weight decay favors simpler solutions over high-order memorization
Abstract
We prove rich algebraic structures of the solution space for 2-layer neural networks with quadratic activation and L2 loss, trained on reasoning tasks in Abelian groups (e.g., modular addition). Such a rich structure enables analytical construction of globally optimal solutions from partial solutions that satisfy only part of the loss, despite its high nonlinearity. We coin the framework CoGS (Composing Global Solutions). Specifically, we show that the weight space over different numbers of hidden nodes of the 2-layer network is equipped with a semi-ring algebraic structure, and the loss function to be optimized consists of sum potentials, which are ring homomorphisms, allowing partial solutions to be composed into global ones by ring addition and multiplication. Our experiments show that around 95% of the solutions obtained…
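To make the setting in the abstract concrete, the following is a minimal sketch of a 2-layer network with quadratic activation evaluated on modular addition with a squared-error loss. It is not the paper's code: the one-hot input encoding, the concatenation of the two operands, the names `modular_addition_data`, `forward`, and `l2_loss`, and the sizes `d` and `q` are all assumptions made for illustration.

```python
import numpy as np

def modular_addition_data(d):
    """All d*d pairs (a, b) with target (a + b) mod d, one-hot encoded (assumed encoding)."""
    a, b = np.meshgrid(np.arange(d), np.arange(d), indexing="ij")
    a, b = a.ravel(), b.ravel()
    eye = np.eye(d)
    x = np.concatenate([eye[a], eye[b]], axis=1)  # shape (d*d, 2d)
    y = eye[(a + b) % d]                          # shape (d*d, d)
    return x, y

def forward(x, w_in, w_out):
    """2-layer network: linear map, quadratic activation, linear readout."""
    hidden = (x @ w_in) ** 2                      # quadratic activation
    return hidden @ w_out

def l2_loss(pred, target):
    """Mean over samples of half the squared error."""
    return 0.5 * np.mean(np.sum((pred - target) ** 2, axis=1))

d, q = 7, 16                                      # group size and hidden width (illustrative)
rng = np.random.default_rng(0)
w_in = rng.normal(scale=0.1, size=(2 * d, q))
w_out = rng.normal(scale=0.1, size=(q, d))

x, y = modular_addition_data(d)
loss = l2_loss(forward(x, w_in, w_out), y)
```

Here `q` is the number of hidden nodes, the quantity over which the abstract says the weight space carries a semi-ring structure. The sketch only evaluates the loss at random weights and does not implement the composition of partial solutions.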
Taxonomy
Topics: Neural Networks and Applications · Constraint Satisfaction and Optimization · Fuzzy Logic and Control Systems
