Implicit Bias and Invariance: How Hopfield Networks Efficiently Learn Graph Orbits
Michael Murray, Tenzin Chan, Kedar Karhadkar, Christopher J. Hillar

TL;DR
This paper demonstrates that Hopfield networks can implicitly learn graph isomorphism classes by leveraging an invariant subspace, with implicit bias towards norm-efficient solutions enabling efficient generalization on group-structured data.
Contribution
It reveals how Hopfield networks inherently develop invariance to graph symmetries and provides theoretical insights into their sample complexity and convergence behavior.
Findings
Graph isomorphism classes can be represented within a three-dimensional invariant subspace.
Minimizing energy flow (MEF) by gradient descent has an implicit bias toward norm-efficient solutions.
Across multiple learning rules, parameters converge toward the invariant subspace as sample size grows.
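One way to build intuition for why so few invariant directions exist is to count how vertex relabellings act on pairs of edges. The sketch below is not the authors' code. It assumes a network with one unit per possible edge of a v-vertex graph, so a weight couples two edges, and it counts the classes of edge pairs that vertex permutations cannot tell apart. The function name `count_edge_pair_orbits` is hypothetical. How these classes map onto the paper's parameterization (for example, whether a threshold term is counted) is not stated here, so treat the link to the three-dimensional subspace as an assumption.

```python
from itertools import combinations


def count_edge_pair_orbits(v):
    """Count orbits of ordered edge pairs under relabelling of v vertices."""
    edges = list(combinations(range(v), 2))
    index = {e: i for i, e in enumerate(edges)}
    n = len(edges)
    parent = list(range(n * n))  # union-find over ordered pairs (i, j)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Adjacent transpositions (k, k+1) generate the full symmetric group,
    # so merging along these generators yields the full orbits.
    generators = []
    for k in range(v - 1):
        g = list(range(v))
        g[k], g[k + 1] = g[k + 1], g[k]
        generators.append(g)

    for g in generators:
        # Image of each edge index under the vertex relabelling g.
        image = [index[tuple(sorted((g[a], g[b])))] for a, b in edges]
        for i in range(n):
            for j in range(n):
                ra = find(i * n + j)
                rb = find(image[i] * n + image[j])
                if ra != rb:
                    parent[ra] = rb

    return len({find(x) for x in range(n * n)})


for v in (4, 5, 6):
    print(v, count_edge_pair_orbits(v))
```

For v of at least 4 the count is three regardless of graph size: the two edges are identical, share exactly one vertex, or are disjoint. A weight matrix that is invariant under relabelling must be constant on each class, which is why the number of free invariant directions does not grow with v.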
Abstract
Many learning problems involve symmetries, and while invariance can be built into neural architectures, it can also emerge implicitly when training on group-structured data. We study this phenomenon in classical Hopfield networks and show they can infer the full isomorphism class of a graph from a small random sample. Our results reveal that: (i) graph isomorphism classes can be represented within a three-dimensional invariant subspace, (ii) using gradient descent to minimize energy flow (MEF) has an implicit bias toward norm-efficient solutions, which underpins a polynomial sample complexity bound for learning isomorphism classes, and (iii) across multiple learning rules, parameters converge toward the invariant subspace as sample sizes grow. Together, these findings highlight a unifying mechanism for generalization in Hopfield networks: a bias toward norm efficiency in learning drives…
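To make "isomorphism class" concrete, the following sketch enumerates the orbit of a small graph as binary edge-indicator vectors, which is the kind of pattern set a binary Hopfield network would be asked to store. It is an illustration under stated assumptions, not the paper's pipeline: the encoding (one bit per possible edge, in `itertools.combinations` order) and the helper names `edge_vector` and `orbit` are hypothetical, and brute-force enumeration over all permutations is only feasible for very small v.

```python
from itertools import combinations, permutations


def edge_vector(edge_set, v):
    """Encode a graph on v vertices as a 0/1 tuple, one bit per possible edge."""
    return tuple(1 if e in edge_set else 0 for e in combinations(range(v), 2))


def orbit(edge_set, v):
    """All distinct relabellings of the graph, as edge-indicator vectors."""
    patterns = set()
    for perm in permutations(range(v)):
        relabelled = {tuple(sorted((perm[a], perm[b]))) for a, b in edge_set}
        patterns.add(edge_vector(relabelled, v))
    return patterns


v = 5
triangle = {(0, 1), (0, 2), (1, 2)}  # a 3-clique on vertices 0, 1, 2
triangle_orbit = orbit(triangle, v)
print(len(triangle_orbit))  # 10 = C(5, 3), one pattern per choice of 3 vertices
```

A training sample in the sense of the abstract would be a random subset of such an orbit; the claim under study is that a network trained on the subset ends up storing the remaining members as well.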
Peer Reviews
Decision · ICLR 2026 Conference Desk Rejected Submission
The authors study multiple classes of graphs, and they report multiple trials in their experiments. MEF is experimentally shown to require fewer samples to perfectly memorize the datasets.
The manuscript is not very well written and is often hard to follow. The text is often obscured by equations, the introduction reads more like a methods section, and a conclusion is missing. The significance of the problem studied is also not clearly stated in the manuscript.
- Good conceptual contribution connecting implicit bias -> min-norm solutions to emergent invariance under group actions for a classical architecture.
- Theoretical framing looks solid, and the experiments consistently show rapid orbit coverage and convergence of weights toward the invariant subspace.
- The results are stated for any isomorphism class, and three families (cliques, bipartite graphs, and Paley graphs) are shown as illustrations.
- Public code is promised and the empirical s
- Payley graph -> Paley graph? This happens a couple of times in the paper / appendix.
- Some of the limitations are already acknowledged by the authors. The paper does not prove that HSVM/MEF solutions converge to the invariant subspace for the actual training problem, only for an averaged surrogate (AHSVM) and via empirical evidence.
- Results focus on three families and relatively small graph sizes (v=8 and 20). It would help to include more and larger values of v and more orbit types.
1. Novel Theoretical Framework
- First rigorous analysis connecting classical Hopfield networks to modern implicit bias theory
- Elegant characterization of invariant parameters as a 3-dimensional subspace regardless of graph size
- Creative reformulation of memorization as linear programming, bridging associative memory and SVM theory
2. Strong Mathematical Foundations
- Rigorous proofs using group theory, convex optimization, and statistical learning theory
- Clear geometric intuiti
1. Severe Practical Limitations
- Restricted to classical Hopfield networks with linear capacity O(n) vs. exponential capacity of modern architectures
- Experiments limited to toy graphs (≤20 vertices) with questionable real-world relevance
- No comparison with modern graph neural networks or graph isomorphism algorithms that vastly outperform this approach
2. Significant Theory-Practice Gaps
- Theorem 3.2 requires exponential iterations for convergence, but experiments use only 1000 it
The theory is novel and the analysis seems solid.
The HN model is somewhat outdated and the binary requirement on the input is quite restrictive. Thus, it is unclear whether the results have implications for modern or practical settings. I am not sure that the results/analysis can be extended to other models or non-binary data.
Taxonomy
Topics: Advanced Graph Neural Networks · Neural Networks and Applications · Graph Theory and Algorithms
