Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems

Colin Raffel; Daniel P. W. Ellis

arXiv:1512.08756·cs.LG·September 21, 2016·273 cites

TL;DR

This paper introduces a simplified attention mechanism for feed-forward neural networks and shows that the resulting model solves the synthetic addition and multiplication long-term memory problems on sequences that are longer and more variable in length than in the best previously published results.

Contribution

The paper proposes a simplified attention mechanism for feed-forward networks that lets them handle memory tasks over sequences that are longer and vary more widely in length.
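A minimal sketch of how such a mechanism can pool a variable-length sequence into one fixed-length vector is given here. It assumes each time step is scored independently of the others, the scores are normalised with a softmax, and the output is the weighted average of the inputs. The function name, the tanh scoring function, and the parameters w and b are illustrative assumptions, not details taken from this page.

```python
import numpy as np

def feed_forward_attention(h, w, b=0.0):
    """Pool a sequence h of shape (T, D) into a single vector of shape (D,).

    w (shape (D,)) and b are hypothetical learnable scoring parameters.
    """
    # One scalar score per time step; tanh scoring is an assumption.
    e = np.tanh(h @ w + b)                 # shape (T,)
    # Softmax over time, shifted by the max for numerical stability.
    exp_e = np.exp(e - e.max())
    alpha = exp_e / exp_e.sum()            # shape (T,), sums to 1
    # Attention-weighted average of the time steps.
    c = alpha @ h                          # shape (D,)
    return c, alpha

# Usage: sequences of different lengths map to vectors of the same size.
rng = np.random.default_rng(0)
w = rng.normal(size=3)
for T in (5, 50):
    c, alpha = feed_forward_attention(rng.normal(size=(T, 3)), w)
    print(T, c.shape, round(float(alpha.sum()), 6))
```

Because the scores depend only on the individual time steps, nothing in this sketch is recurrent, and the output size does not depend on the sequence length T.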

Findings

01 Solves the synthetic long-term memory problems on longer sequences than the best published results

02 Exceeds the best published results on the synthetic addition and multiplication tasks in both sequence length and variability of length

03 Shows that a simplified form of attention is effective in feed-forward architectures

Abstract

We propose a simplified model of attention which is applicable to feed-forward neural networks and demonstrate that the resulting model can solve the synthetic "addition" and "multiplication" long-term memory problems for sequence lengths which are both longer and more widely varying than the best published results for these tasks.
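For readers unfamiliar with the "addition" and "multiplication" problems, the sketch below generates one example under a commonly used formulation. Each time step carries a random value and a marker, exactly two steps are marked, and the target is the sum or product of the two marked values. The sampling range, the function name, and the array layout are assumptions made for illustration and may differ from the paper's exact setup.

```python
import numpy as np

def make_marked_sequence(length, rng, task="addition"):
    """Return (x, y): x has shape (length, 2), y is a scalar target."""
    # Column 0: random values (uniform on [0, 1) is an assumed range).
    values = rng.uniform(0.0, 1.0, size=length)
    # Column 1: marker that is 1 at exactly two distinct positions.
    marks = np.zeros(length)
    i, j = rng.choice(length, size=2, replace=False)
    marks[[i, j]] = 1.0
    x = np.stack([values, marks], axis=1)
    # Target combines only the two marked values.
    if task == "addition":
        y = values[i] + values[j]
    else:
        y = values[i] * values[j]
    return x, y

# Usage: the marked positions can be far apart, so the model must
# retain information across the whole sequence.
rng = np.random.default_rng(0)
x, y = make_marked_sequence(100, rng, task="addition")
print(x.shape, float(x[:, 1].sum()))
```

Varying the length argument from example to example gives the widely varying sequence lengths that the abstract refers to.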

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and ELM · Advanced Neural Network Applications