Rotational Unit of Memory

Rumen Dangovski; Li Jing; Marin Soljacic

arXiv:1710.09537 · cs.LG · October 27, 2017 · 2 cites

TL;DR

The paper introduces Rotational Unit of Memory (RUM), a novel RNN architecture that uses rotational operations to enhance long-term memory manipulation, outperforming existing models on various sequential tasks.

Contribution

RUM unifies unitary matrices and associative memory within a single RNN model, improving long-term dependency learning without external attention mechanisms.

Findings

01. RUM fully learns the Copying Memory task.

02. RUM achieves state-of-the-art results on the Recall task.

03. RUM's performance on the bAbI question answering task is comparable to that of attention-based models.

Abstract

The concepts of unitary evolution matrices and associative memory have boosted the field of Recurrent Neural Networks (RNNs) to state-of-the-art performance in a variety of sequential tasks. However, RNNs still have a limited capacity to manipulate long-term memory. To bypass this weakness, the most successful applications of RNNs use external techniques such as attention mechanisms. In this paper we propose a novel RNN model that unifies the state-of-the-art approaches: Rotational Unit of Memory (RUM). The core of RUM is its rotational operation, which is, naturally, a unitary matrix, providing architectures with the power to learn long-term dependencies by overcoming the vanishing and exploding gradients problem. Moreover, the rotational unit also serves as associative memory. We evaluate our model on synthetic memorization, question answering and language modeling tasks. RUM learns the…
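The abstract's claim that a rotation is "naturally" a unitary matrix can be checked numerically. The sketch below is only an illustration and is not taken from the paper's code: the function name `rotation_between` is hypothetical, and the construction (a rotation in the plane spanned by two vectors) is one standard way to build such a matrix, assumed here for demonstration. It shows that the resulting real matrix is orthogonal, so applying it repeatedly to a hidden state leaves the norm unchanged, which is the property usually associated with avoiding vanishing and exploding gradients.

```python
import numpy as np

def rotation_between(a, b):
    """Orthogonal matrix that rotates the direction of a onto the direction of b.

    Hypothetical helper for illustration. Assumes a and b are nonzero
    and not parallel, so that they span a well-defined plane.
    """
    n = a.shape[0]
    u = a / np.linalg.norm(a)             # first basis vector of the plane
    w = b - np.dot(u, b) * u              # part of b orthogonal to u
    v = w / np.linalg.norm(w)             # second basis vector of the plane
    cos_t = np.dot(u, b) / np.linalg.norm(b)
    sin_t = np.linalg.norm(w) / np.linalg.norm(b)
    plane = np.stack([u, v], axis=1)      # shape (n, 2)
    rot2d = np.array([[cos_t, -sin_t],
                      [sin_t,  cos_t]])
    # Identity outside the plane, a 2D rotation inside it.
    return np.eye(n) - plane @ plane.T + plane @ rot2d @ plane.T

rng = np.random.default_rng(0)
a, b, h = rng.normal(size=(3, 6))         # three arbitrary 6-dimensional vectors
R = rotation_between(a, b)

# Orthogonality: R^T R should equal the identity up to rounding.
print(np.allclose(R.T @ R, np.eye(6)))

# Norm preservation under many repeated applications.
x = h.copy()
for _ in range(1000):
    x = R @ x
print(np.linalg.norm(h), np.linalg.norm(x))
```

A generic dense matrix applied 1000 times would typically shrink or blow up the state; the two printed norms here should agree up to floating-point error.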

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis