Architectural Proprioception in State Space Models: Thermodynamic Training Induces Anticipatory Halt Detection
Jay Noon

TL;DR
This paper demonstrates that thermodynamic training of State Space Models induces architectural proprioception, enabling anticipatory halt detection that is architecture-dependent and transferable across tasks, with implications for efficient and self-aware neural computation.
Contribution
The study introduces a thermodynamic loss function for training SSMs, revealing a universal anticipatory halt detection mechanism absent in Transformers, and shows its controllability and transferability across tasks.
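As a rough illustration of what "cross-entropy plus a penalty on computational waste" could look like, the sketch below adds a linear cost per computation step to the usual negative log-likelihood. The paper's actual loss is not given in this summary, so the function name `penalized_loss`, the `steps_used` count, and the `waste_weight` coefficient are all hypothetical stand-ins chosen for illustration.

```python
import math

def penalized_loss(probs, target, steps_used, waste_weight=0.1):
    """Cross-entropy plus a linear penalty on computation steps.

    Hypothetical form: the summary only says the loss penalizes
    computational waste alongside cross-entropy, not how.
    """
    # Standard cross-entropy for a single example: -log p(target).
    cross_entropy = -math.log(probs[target])
    # Assumed waste term: each step taken costs `waste_weight`.
    waste = waste_weight * steps_used
    return cross_entropy + waste

# Same prediction quality, different amounts of computation.
short_run = penalized_loss([0.5, 0.5], target=0, steps_used=4)
long_run = penalized_loss([0.5, 0.5], target=0, steps_used=12)
```

Under this assumed form, two runs with identical predictions differ only by the waste term, so the longer run is penalized more. That is the kind of pressure that would reward halting early.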
Findings
Thermodynamically-trained SSMs develop anticipatory halt signals.
Transformers do not exhibit the same coupling, indicating architecture dependence.
The anticipatory coupling is controllable via training parameters and generalizes across tasks.
Abstract
We introduce the Probability Navigation Architecture (PNA) framework, which treats neural computation as navigation through a probability manifold governed by thermodynamic principles. We train State Space Models (SSMs) and Transformers with a novel thermodynamic loss function that penalizes computational waste alongside standard cross-entropy. Across 19 experimental phases, we discover that thermodynamically-trained SSMs develop architectural proprioception: a strong anticipatory coupling between recurrent state entropy and halt confidence (r = -0.836, p < 0.001) in which the halt signal leads state entropy collapse by exactly two tokens (tau = -2.0). This Universal Stopping Signature (USS) reproduces to four decimal places across random seeds and generalizes to a structurally distinct sorting task. Critically, Transformers trained identically show no such coupling (r = -0.07),…
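The lead-lag relationship described in the abstract (a negative correlation that is strongest when the halt signal is shifted two tokens ahead of state entropy) can be estimated with a lagged Pearson correlation. The following is a minimal sketch on synthetic data, not the paper's analysis code. The function names and the sign convention (lag `k` pairs `halt[t]` with `entropy[t - k]`, so a negative lag means the halt signal leads) are assumptions made for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / math.sqrt(vx * vy)

def lagged_correlation(halt, entropy, lag):
    """Correlate halt[t] with entropy[t - lag] over the valid overlap.

    Assumed convention: lag = -2 compares halt[t] with entropy[t + 2],
    i.e. the halt signal leads entropy by two tokens.
    """
    n = len(halt)
    pairs = [(halt[t], entropy[t - lag]) for t in range(n) if 0 <= t - lag < n]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)

def strongest_lag(halt, entropy, max_lag=4):
    """Return (lag, r) for the lag with the largest |r|."""
    scores = {k: lagged_correlation(halt, entropy, k)
              for k in range(-max_lag, max_lag + 1)}
    best = max(scores, key=lambda k: abs(scores[k]))
    return best, scores[best]

# Synthetic series: entropy collapses at token 10, while halt
# confidence rises two tokens earlier, at token 8.
entropy = [1.0] * 10 + [0.0] * 10
halt = [0.0] * 8 + [1.0] * 12

best_lag, r = strongest_lag(halt, entropy)
```

On this toy input the strongest coupling falls at lag -2 with a strongly negative correlation, which matches the shape of the reported signature. Real halt and entropy traces would be noisier, and the paper's own estimator for tau may differ.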
Taxonomy
Topics: Advanced Memory and Neural Computing · Generative Adversarial Networks and Image Synthesis · EEG and Brain-Computer Interfaces
