Episodic Memory Theory for the Mechanistic Interpretation of Recurrent Neural Networks
Arjun Karuvally, Peter Delmastro, Hava T. Siegelmann

TL;DR
This paper introduces the Episodic Memory Theory (EMT), which interprets RNNs as discrete-time analogs of the General Sequential Episodic Memory Model, and pairs that framework with empirical evidence for how trained RNNs implement variable binding.
Contribution
It proposes EMT as a new theoretical framework, introduces a set of algorithmic tasks for probing variable binding in RNNs, and gives an algorithm for defining a privileged basis that reveals hidden neurons involved in the temporal storage of variables.
Findings
Trained RNNs converge to the variable binding circuit
Privileged basis enhances interpretability of RNNs
RNNs exhibit universality in variable binding dynamics
Abstract
Understanding the intricate operations of Recurrent Neural Networks (RNNs) mechanistically is pivotal for advancing their capabilities and applications. In this pursuit, we propose the Episodic Memory Theory (EMT), illustrating that RNNs can be conceptualized as discrete-time analogs of the recently proposed General Sequential Episodic Memory Model. To substantiate EMT, we introduce a novel set of algorithmic tasks tailored to probe the variable binding behavior in RNNs. Utilizing the EMT, we formulate a mathematically rigorous circuit that facilitates variable binding in these tasks. Our empirical investigations reveal that trained RNNs consistently converge to the variable binding circuit, thus indicating universality in the dynamics of RNNs. Building on these findings, we devise an algorithm to define a privileged basis, which reveals hidden neurons instrumental in the temporal…
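As a rough illustration of what a variable binding circuit in a linear RNN could look like, the sketch below stores s variables in s slots of the hidden state, shifts the slots at every step, and writes a linear combination of the stored variables into the last slot. This is a minimal sketch under stated assumptions, not the paper's construction: the linear composition rule, the block layout of the recurrent matrix, and the names build_binding_matrix, run_circuit, and reference_rollout are all hypothetical and chosen only for illustration.

```python
import numpy as np


def build_binding_matrix(coeffs):
    """Build a block recurrent matrix for a hypothetical binding circuit.

    coeffs: list of s square (d x d) arrays; coeffs[j] weights the
    variable stored in slot j when composing the next variable.
    (Assumed linear composition rule, for illustration only.)
    """
    s = len(coeffs)
    d = coeffs[0].shape[0]
    W = np.zeros((s * d, s * d))
    # Shift: slot i of the new state copies slot i + 1 of the old state.
    for i in range(s - 1):
        W[i * d:(i + 1) * d, (i + 1) * d:(i + 2) * d] = np.eye(d)
    # Compose: the last slot receives a linear combination of all slots.
    for j, A in enumerate(coeffs):
        W[(s - 1) * d:, j * d:(j + 1) * d] = A
    return W


def run_circuit(W, init_vars, steps):
    """Iterate h <- W h and read the newest variable from the last slot."""
    d = init_vars[0].shape[0]
    h = np.concatenate(init_vars)  # slots ordered oldest to newest
    outputs = []
    for _ in range(steps):
        h = W @ h
        outputs.append(h[-d:].copy())
    return outputs


def reference_rollout(coeffs, init_vars, steps):
    """Same recurrence written directly on a list of past variables."""
    s = len(coeffs)
    history = list(init_vars)
    outputs = []
    for _ in range(steps):
        new = sum(A @ u for A, u in zip(coeffs, history[-s:]))
        history.append(new)
        outputs.append(new)
    return outputs
```

Comparing run_circuit against reference_rollout on the same coefficients and initial variables is one way to check that the single recurrent matrix reproduces the slot-by-slot recurrence. Real trained RNNs would express such a circuit in an arbitrary basis, which is where a privileged basis of the kind the abstract mentions becomes relevant.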
Taxonomy
Topics: Neural Networks and Applications · Neural dynamics and brain function · Advanced Memory and Neural Computing
