Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems
Colin Raffel, Daniel P. W. Ellis

TL;DR
This paper introduces a simplified attention mechanism for feed-forward neural networks and shows that the resulting model can solve the synthetic addition and multiplication long-term memory problems on sequences that are longer and more variable in length than those in the best previously published results.
Contribution
The paper proposes a simplified attention mechanism that can be applied to feed-forward networks, allowing them to solve some long-term memory tasks over sequences that are longer and vary more widely in length.
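As a rough illustration of how attention can let a feed-forward model handle variable-length input, the sketch below scores each time step independently, normalizes the scores with a softmax, and returns the weighted average of the sequence. The scoring function (a tanh of a learned linear projection) and the names feed_forward_attention, w, and b are assumptions made for illustration; this page does not specify them.

```python
import numpy as np

def feed_forward_attention(h, w, b):
    """Pool a variable-length sequence into one fixed-size vector.

    h: array of shape (T, D), one D-dimensional state per time step.
    w: array of shape (D,), hypothetical learned scoring weights.
    b: scalar, hypothetical learned scoring bias.
    Returns (context of shape (D,), attention weights of shape (T,)).
    """
    # Score each time step on its own; no recurrence is involved.
    e = np.tanh(h @ w + b)
    # Softmax over time, shifted by the max for numerical stability.
    e = e - e.max()
    alpha = np.exp(e) / np.exp(e).sum()
    # Context vector: attention-weighted average of the states.
    c = alpha @ h
    return c, alpha
```

Because the output has shape (D,) regardless of T, a fixed-size feed-forward network can be applied to the context vector for sequences of any length.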
Findings
Solves the synthetic addition and multiplication long-term memory problems with a feed-forward model
Handles sequence lengths that are longer and more widely varying than the best published results for these tasks
Shows that simplified attention lets a feed-forward architecture address some long-term memory problems
Abstract
We propose a simplified model of attention which is applicable to feed-forward neural networks and demonstrate that the resulting model can solve the synthetic "addition" and "multiplication" long-term memory problems for sequence lengths which are both longer and more widely varying than the best published results for these tasks.
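For readers unfamiliar with the synthetic tasks named in the abstract, the sketch below generates one example in the form these problems usually take: each time step carries a value and a marker, exactly two steps are marked, and the target is the sum (or product) of the two marked values. The uniform value distribution, the unrestricted marker positions, and the function name make_memory_example are assumptions for illustration and may differ from the paper's exact setup.

```python
import numpy as np

def make_memory_example(length, rng, multiply=False):
    """Generate one 'addition' or 'multiplication' task example.

    Returns x of shape (length, 2) and a scalar target y.
    Column 0 of x holds random values; column 1 holds the markers.
    """
    # Assumed value distribution: uniform on [0, 1).
    values = rng.uniform(0.0, 1.0, size=length)
    markers = np.zeros(length)
    # Mark exactly two distinct positions anywhere in the sequence.
    i, j = rng.choice(length, size=2, replace=False)
    markers[[i, j]] = 1.0
    x = np.stack([values, markers], axis=1)
    # The target depends only on the two marked values.
    y = values[i] * values[j] if multiply else values[i] + values[j]
    return x, y

rng = np.random.default_rng(0)
x, y = make_memory_example(50, rng)
```

The difficulty comes from the marked steps being arbitrarily far apart, so a model must retain or retrieve information across the whole sequence.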
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Advanced Neural Network Applications
