State-Space Constraints Can Improve the Generalisation of the Differentiable Neural Computer to Input Sequences With Unseen Length
Patrick Ofner, Roman Kern

TL;DR
This paper shows that constraining the state space of a differentiable neural computer's controller network improves its ability to generalise to input sequences longer than those seen during training, and allows the memory matrix to be extended without retraining.
Contribution
It introduces two techniques, state compression and state regularisation, that constrain the state space of the controller network and improve the generalisation of DNCs to longer input sequences. It also highlights the importance of structured state spaces.
Findings
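Constrained DNCs process input sequences up to 2.3 times longer than those processed by an unconstrained baseline controller network.
With the constraints applied, the DNC's memory matrix can be extended without retraining, which allows processing of input sequences that are 10.4 times longer.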
Structured state spaces correlate with better task performance.
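The memory-extension finding relies on a general property of DNC-style content-based addressing: the read weighting is a softmax over similarities between a key and each memory row, so the same key works for any number of rows. The sketch below illustrates only that property. It is not the paper's code; the function name `content_read`, the sharpness value `beta`, and all sizes are assumptions chosen for illustration.

```python
import numpy as np

def content_read(memory, key, beta=5.0):
    """Content-based read: softmax over cosine similarity of key and memory rows.

    memory: array of shape (N, W); key: array of shape (W,).
    beta is a hypothetical sharpness value, not taken from the paper.
    """
    eps = 1e-8  # avoids division by zero for all-zero rows
    mem_norm = np.linalg.norm(memory, axis=1) + eps
    key_norm = np.linalg.norm(key) + eps
    similarity = memory @ key / (mem_norm * key_norm)  # shape (N,)
    scores = beta * similarity
    scores = scores - scores.max()  # numerical stability for the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ memory, weights  # read vector of shape (W,), weights of shape (N,)

rng = np.random.default_rng(0)
W = 8  # word size (assumed)
key = rng.normal(size=W)

small = rng.normal(size=(16, W))    # memory size used during "training" (assumed)
large = rng.normal(size=(160, W))   # enlarged memory at test time (assumed)

r_small, w_small = content_read(small, key)
r_large, w_large = content_read(large, key)
```

Nothing in `content_read` depends on the number of rows, which is why a memory matrix can in principle be enlarged after training. Whether the controller still behaves sensibly with the larger memory is the question the paper's constraints address.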
Abstract
Memory-augmented neural networks (MANNs) can perform algorithmic tasks such as sorting. However, they often fail to generalise to input sequence lengths not encountered during training. We introduce two approaches that constrain the state space of the MANN's controller network: state compression and state regularisation. We empirically demonstrated that both approaches can improve generalisation to input sequences of out-of-distribution lengths for a specific type of MANN: the differentiable neural computer (DNC). The constrained DNC could process input sequences that were up to 2.3 times longer than those processed by an unconstrained baseline controller network. Notably, the applied constraints enabled the extension of the DNC's memory matrix without the need for retraining and thus allowed the processing of input sequences that were 10.4 times longer. However, the improvements were…
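This summary does not give the exact formulations of state compression and state regularisation. As one plausible reading, and only as an assumption, the sketch below treats compression as a low-dimensional bottleneck applied to a recurrent controller's hidden state, and regularisation as an L2 penalty on the hidden states that would be added to the training loss. The plain tanh controller, all names, and all sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_bottleneck = 4, 32, 6  # hypothetical sizes

# Randomly initialised parameters; a real controller would learn these.
params = {
    "W_x": rng.normal(scale=0.1, size=(n_hidden, n_in)),
    "W_h": rng.normal(scale=0.1, size=(n_hidden, n_hidden)),
    "b": np.zeros(n_hidden),
    "W_down": rng.normal(scale=0.1, size=(n_bottleneck, n_hidden)),
    "W_up": rng.normal(scale=0.1, size=(n_hidden, n_bottleneck)),
}

def controller_step(x, h, p, compress=False):
    """One step of a plain tanh RNN controller (an assumed stand-in)."""
    h_new = np.tanh(p["W_x"] @ x + p["W_h"] @ h + p["b"])
    if compress:
        # Assumed form of state compression: squeeze the state through a
        # narrow bottleneck and project it back to the hidden size.
        h_new = p["W_up"] @ np.tanh(p["W_down"] @ h_new)
    return h_new

def state_penalty(states, strength=1e-3):
    """Assumed form of state regularisation: mean squared L2 norm of the states."""
    return strength * np.mean(np.sum(states ** 2, axis=1))

def run(inputs, p, compress=False):
    """Roll the controller over a sequence and collect the hidden states."""
    h = np.zeros(n_hidden)
    states = []
    for x in inputs:
        h = controller_step(x, h, p, compress)
        states.append(h)
    return np.stack(states)

inputs = rng.normal(size=(10, n_in))
states_plain = run(inputs, params)
states_compressed = run(inputs, params, compress=True)
penalty = state_penalty(states_plain)  # would be added to the task loss
```

In this sketch the compressed states are confined to a subspace of at most `n_bottleneck` dimensions, while the penalty discourages large state norms. Both are ways of restricting the region of state space the controller can visit, which is the general idea the abstract describes.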
Taxonomy
Topics: Neural Networks and Applications · Advanced Memory and Neural Computing · Machine Learning and ELM
Methods: High-Order Consensuses
