Memory Capacity of Recurrent Neural Networks with Matrix Representation
Animesh Renanse, Alok Sharma, Rohitash Chandra

TL;DR
This paper investigates the memory capacity of matrix-based RNNs and introduces Matrix NTMs, showing that matrix representations can enhance memory capacity and performance in algorithmic tasks compared to vector-based networks.
Contribution
It defines a probabilistic measure of memory capacity for matrix RNNs, derives bounds, and introduces Matrix NTMs with external memory, demonstrating improved performance over matrix RNNs.
Findings
Memory capacity of N×N matrix RNNs is bounded by N^2.
Matrix NTMs outperform matrix RNNs in copying and recall tasks.
External memory increases the memory capacity and performance of matrix-based RNNs.
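For intuition about the first finding, an N×N hidden state holds N^2 scalar entries, the same quantity that appears in the bound. The sketch below is a minimal illustration, assuming a bilinear update of the form H_t = tanh(U X_t V + W H_{t-1} Z). This form, and every name in the code (matrix_rnn_step, run_demo, the weight names u, v, w, z), is a hypothetical choice for illustration and is not taken from the paper, whose exact update rule may differ.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matrix_rnn_step(h_prev, x, u, v, w, z):
    """One assumed bilinear update: H_t = tanh(U X_t V + W H_{t-1} Z).

    Illustrative form only; not confirmed as the paper's update rule.
    """
    input_term = matmul(matmul(u, x), v)       # U X_t V
    state_term = matmul(matmul(w, h_prev), z)  # W H_{t-1} Z
    return [
        [math.tanh(a + b) for a, b in zip(row_in, row_st)]
        for row_in, row_st in zip(input_term, state_term)
    ]


def random_matrix(n, rng, scale=0.5):
    """Return an n x n matrix with entries drawn uniformly from [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(n)] for _ in range(n)]


def run_demo(n=3, steps=4, seed=0):
    """Run a few steps on random inputs and return the final N x N state."""
    rng = random.Random(seed)
    u, v, w, z = (random_matrix(n, rng) for _ in range(4))
    h = [[0.0] * n for _ in range(n)]  # the state stays a matrix, never flattened
    for _ in range(steps):
        x = random_matrix(n, rng, scale=1.0)
        h = matrix_rnn_step(h, x, u, v, w, z)
    return h


state = run_demo()
state_size = sum(len(row) for row in state)  # N * N scalar entries in the state
```

With n=3 the state has 9 entries. A vector RNN would need a length-9 hidden vector to hold the same number of scalars, but it would lose the row and column structure that the matrix form keeps.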
Abstract
It is well known that canonical recurrent neural networks (RNNs) face limitations in learning long-term dependencies, which have been addressed by memory structures in long short-term memory (LSTM) networks. Neural Turing machines (NTMs) are novel RNNs that implement the notion of programmable computers with neural network controllers that can learn simple algorithmic tasks. Matrix neural networks feature matrix representation, which inherently preserves the spatial structure of data when compared to canonical neural networks that use vector-based representation. One may then argue that neural networks with matrix representations may have the potential to provide better memory capacity. In this paper, we define and study a probabilistic notion of memory capacity based on Fisher information for matrix-based RNNs. We find bounds on memory capacity for such networks under various hypotheses…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Ferroelectric and Negative Capacitance Devices
