Mechanistic Interpretability of RNNs emulating Hidden Markov Models
Elia Torre, Michele Viscione, Lucas Pompe, Benjamin F Grewe, Valerio Mante

TL;DR
This paper shows that RNNs can emulate Hidden Markov Models by developing structured dynamics and specialized 'kick neurons' that drive stochastic transitions between states, and it reverse-engineers the trained networks to expose these mechanisms.
Contribution
The paper uncovers the mechanisms by which RNNs replicate HMM-like stochastic dynamics, including the role of 'kick neurons' and the emergence of structured connectivity during training.
Findings
RNNs can replicate HMM emission statistics.
Transitions are governed by slow, noise-driven dynamics and fast deterministic shifts.
Structured connectivity with 'kick neurons' facilitates probabilistic computations.
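To make "HMM emission statistics" concrete, here is a minimal sketch of the kind of target process an RNN would have to match: a discrete HMM that switches stochastically between latent states and emits a symbol at each step. The two-state setup, the TRANSITIONS and EMISSIONS values, and the function names are all illustrative assumptions, not taken from the paper.

```python
import random
from collections import Counter

# Hypothetical 2-state, 2-symbol HMM; these values are NOT from the paper.
TRANSITIONS = [[0.95, 0.05],
               [0.10, 0.90]]  # TRANSITIONS[i][j] = P(next state j | current state i)
EMISSIONS = [[0.8, 0.2],
             [0.3, 0.7]]      # EMISSIONS[i][k] = P(symbol k | state i)


def sample_hmm(n_steps, seed=0):
    """Sample latent states and emitted symbols from the toy HMM."""
    rng = random.Random(seed)
    state = 0
    states, symbols = [], []
    for _ in range(n_steps):
        states.append(state)
        # Emit a symbol conditioned on the current latent state.
        symbols.append(rng.choices([0, 1], weights=EMISSIONS[state])[0])
        # Stochastic transition to the next latent state.
        state = rng.choices([0, 1], weights=TRANSITIONS[state])[0]
    return states, symbols


def emission_frequencies(symbols):
    """Empirical frequency of each emitted symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return {k: counts[k] / total for k in sorted(counts)}


states, symbols = sample_hmm(10_000)
freqs = emission_frequencies(symbols)
```

Comparing such empirical frequencies (or richer sequence statistics) between HMM samples and network outputs is one plausible way to check whether a trained RNN reproduces the HMM's emissions; the paper's actual evaluation may differ.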
Abstract
Recurrent neural networks (RNNs) provide a powerful approach in neuroscience to infer latent dynamics in neural populations and to generate hypotheses about the neural computations underlying behavior. However, past work has focused on relatively simple, input-driven, and largely deterministic behaviors; little is known about the mechanisms that would allow RNNs to generate the richer, spontaneous, and potentially stochastic behaviors observed in natural settings. Modeling with Hidden Markov Models (HMMs) has revealed a segmentation of natural behaviors into discrete latent states with stochastic transitions between them, a type of dynamics that may appear at odds with the continuous state spaces implemented by RNNs. Here we first show that RNNs can replicate HMM emission statistics and then reverse-engineer the trained networks to uncover the mechanisms they implement. In the absence…
