Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding

Nan Rosemary Ke; Anirudh Goyal; Olexa Bilaniuk; Jonathan Binas; Michael C. Mozer; Chris Pal; Yoshua Bengio

arXiv:1809.03702·cs.LG·September 12, 2018·34 cites

Open Access

TL;DR

Sparse Attentive Backtracking is a biologically inspired method for learning long-term dependencies in sequences. It uses attention-based memory associations between past and present states for credit assignment, which reduces computational cost relative to full backpropagation through time (BPTT).

Contribution

The paper proposes a novel attention-based algorithm that enables credit assignment over long sequences without full backpropagation through time, inspired by human memory recall.

Findings

01. Matches or outperforms BPTT and truncated BPTT on tasks with long-term dependencies.

02. Transfers to longer sequences better than LSTMs trained with BPTT or with full self-attention.

03. Requires fewer backward passes, reducing computational cost.

Abstract

Learning long-term dependencies in extended temporal sequences requires credit assignment to events far back in the past. The most common method for training recurrent neural networks, back-propagation through time (BPTT), requires credit information to be propagated backwards through every single step of the forward computation, potentially over thousands or millions of time steps. This becomes computationally expensive or even infeasible when used with long sequences. Importantly, biological brains are unlikely to perform such detailed reverse replay over very long sequences of internal states (consider days, months, or years.) However, humans are often reminded of past memories or mental states which are associated with the current mental state. We consider the hypothesis that such memory associations between past and present could be used for credit assignment through arbitrarily…
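
To make the idea of attention-based "reminding" concrete, here is a minimal NumPy sketch of sparse attention over stored past states. It is an illustration under stated assumptions, not the authors' implementation: the function name `sparse_topk_attention`, the dot-product scoring, and the top-k selection rule are all hypothetical choices made for this example, and the sketch covers only the forward selection step (no training or gradient computation).

```python
import numpy as np

def sparse_topk_attention(query, memory, k):
    """Attend to the k stored past states most similar to the current state.

    query:  shape (d,)   current hidden state (hypothetical name)
    memory: shape (t, d) stored past hidden states
    Returns (summary, selected_indices, weights).
    """
    # Assumption: dot-product similarity; the source does not specify the scorer.
    scores = memory @ query                       # shape (t,)
    k = min(k, len(scores))
    top = np.argsort(scores)[-k:][::-1]           # indices of the k highest scores
    shifted = scores[top] - scores[top].max()     # stabilise the softmax
    weights = np.exp(shifted) / np.exp(shifted).sum()
    summary = weights @ memory[top]               # weighted sum of selected states
    return summary, top, weights

rng = np.random.default_rng(0)
memory = rng.normal(size=(50, 8))                 # 50 past states of dimension 8
memory /= np.linalg.norm(memory, axis=1, keepdims=True)  # unit-norm rows
query = 3.0 * memory[17]                          # current state resembles step 17
summary, selected, weights = sparse_topk_attention(query, memory, k=3)
```

Only the `k` selected time steps receive nonzero attention weight. In a trainable version, those would be the only past states through which credit needs to flow back, which is the intuition behind needing fewer backward passes than full BPTT.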

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning