Dynamical Systems Theory Behind a Hierarchical Reasoning Model
Vasiliy A. Es'kin, Mikhail E. Smorkalov

TL;DR
This paper introduces the Contraction Mapping Model (CMM), a mathematically grounded recursive reasoning architecture using neural differential equations, achieving state-of-the-art results with significantly fewer parameters.
Contribution
The paper presents CMM, a novel stable recursive reasoning model based on neural differential equations, with rigorous convergence guarantees and exceptional parameter efficiency.
Findings
CMM achieves 93.7% accuracy on Sudoku-Extreme, surpassing larger models.
Even with only 0.26M parameters, CMM maintains high performance on reasoning benchmarks.
CMM outperforms existing models like HRM and TRM in accuracy and stability.
Abstract
Current large language models (LLMs) primarily rely on linear sequence generation and massive parameter counts, yet they severely struggle with complex algorithmic reasoning. While recent reasoning architectures, such as the Hierarchical Reasoning Model (HRM) and Tiny Recursive Model (TRM), demonstrate that compact recursive networks can tackle these tasks, their training dynamics often lack rigorous mathematical guarantees, leading to instability and representational collapse. We propose the Contraction Mapping Model (CMM), a novel architecture that reformulates discrete recursive reasoning into continuous Neural Ordinary and Stochastic Differential Equations (NODEs/NSDEs). By explicitly enforcing the convergence of the latent phase point to a stable equilibrium state and mitigating feature collapse with a hyperspherical repulsion loss, the CMM provides a mathematically grounded and…
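The convergence idea in the abstract can be illustrated with a toy fixed-point iteration. The sketch below is not the CMM and uses no neural differential equations. It assumes a hypothetical map f(z) = tanh(Wz + b), with W rescaled so that its largest absolute row sum is 0.5, which makes f a contraction in the sup norm. The names `make_contraction`, `sup_dist`, and `iterate_to_equilibrium` are illustrative and do not come from the paper. By the Banach fixed-point theorem, iterating such a map from any starting state should reach the same equilibrium.

```python
import math
import random


def make_contraction(dim, lipschitz=0.5, seed=0):
    """Build a toy map f(z) = tanh(W z + b) that contracts in the sup norm.

    Assumption for illustration only: W is random and rescaled so that its
    largest absolute row sum equals `lipschitz` (< 1). tanh is 1-Lipschitz,
    so f inherits the same contraction constant.
    """
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(dim)]
    max_row_sum = max(sum(abs(w) for w in row) for row in W)
    scale = lipschitz / max_row_sum
    W = [[w * scale for w in row] for row in W]
    b = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    def f(z):
        return [
            math.tanh(sum(w * x for w, x in zip(row, z)) + bias)
            for row, bias in zip(W, b)
        ]

    return f


def sup_dist(u, v):
    """Sup-norm distance between two equal-length vectors."""
    return max(abs(a - c) for a, c in zip(u, v))


def iterate_to_equilibrium(f, z0, tol=1e-12, max_steps=1000):
    """Apply f repeatedly until successive states differ by less than tol."""
    z = list(z0)
    for step in range(1, max_steps + 1):
        z_next = f(z)
        if sup_dist(z_next, z) < tol:
            return z_next, step
        z = z_next
    return z, max_steps


# Two very different starting states should settle at the same equilibrium.
f = make_contraction(dim=8)
start_rng = random.Random(1)
z_a, steps_a = iterate_to_equilibrium(f, [start_rng.gauss(0.0, 1.0) for _ in range(8)])
z_b, steps_b = iterate_to_equilibrium(f, [start_rng.gauss(0.0, 10.0) for _ in range(8)])
print("steps:", steps_a, steps_b)
print("gap between equilibria:", sup_dist(z_a, z_b))
```

Because the contraction constant is 0.5, the distance between successive iterates at least halves at every step, so the loop stops well before `max_steps`. A trained model would have to enforce or encourage a comparable property on its learned update; how the CMM does this is not specified in the text above.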
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Topic Modeling
