Dynamic Evaluation of Neural Sequence Models

Ben Krause; Emmanuel Kahembwe; Iain Murray; Steve Renals

arXiv:1709.07432·cs.NE·October 27, 2017·60 cites


TL;DR

This paper presents dynamic evaluation, a method that adapts neural sequence models to recent history at test time using gradient descent, so that they assign higher probabilities to re-occurring sequential patterns. It outperforms the existing adaptation approaches the authors compare against.

Contribution

It presents a methodology for using dynamic evaluation to improve neural sequence models by adapting them to recent history, setting new state-of-the-art results on Penn Treebank, WikiText-2, text8, and the Hutter Prize dataset.

Findings

01

Achieved state-of-the-art word-level perplexities of 51.1 on Penn Treebank and 44.3 on WikiText-2.

02

Reduced state-of-the-art character-level cross-entropy to 1.19 bits/char on text8 and 1.08 bits/char on the Hutter Prize dataset.

03

Outperformed existing adaptation approaches in the authors' comparisons.

Abstract

We present methodology for using dynamic evaluation to improve neural sequence models. Models are adapted to recent history via a gradient descent based mechanism, causing them to assign higher probabilities to re-occurring sequential patterns. Dynamic evaluation outperforms existing adaptation approaches in our comparisons. Dynamic evaluation improves the state-of-the-art word-level perplexities on the Penn Treebank and WikiText-2 datasets to 51.1 and 44.3 respectively, and the state-of-the-art character-level cross-entropies on the text8 and Hutter Prize datasets to 1.19 bits/char and 1.08 bits/char respectively.
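The adaptation mechanism described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a tiny bigram softmax model in place of a neural network, starts from uniform (all-zero) logits in place of trained parameters, and uses plain SGD as a stand-in for whatever update rule the paper applies. The names `evaluate`, `segment_loss_and_grad`, `segment_len`, and `lr` are hypothetical. The key point it shows is the ordering: each segment is scored with the current parameters first, and only then used for a gradient step.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    z = sum(exps)
    return [e / z for e in exps]


def segment_loss_and_grad(weights, segment):
    """Summed negative log-likelihood (nats) of (prev, next) pairs,
    plus its gradient with respect to the bigram logits."""
    vocab = len(weights)
    grad = [[0.0] * vocab for _ in range(vocab)]
    loss = 0.0
    for prev, nxt in segment:
        probs = softmax(weights[prev])
        loss -= math.log(probs[nxt])
        for k in range(vocab):
            # Softmax cross-entropy gradient: p - one_hot(target).
            grad[prev][k] += probs[k] - (1.0 if k == nxt else 0.0)
    return loss, grad


def evaluate(tokens, vocab, segment_len=5, lr=0.0):
    """Average per-token loss; lr > 0 turns on dynamic evaluation."""
    # Assumption: uniform logits stand in for a trained model.
    weights = [[0.0] * vocab for _ in range(vocab)]
    pairs = list(zip(tokens[:-1], tokens[1:]))
    total = 0.0
    for start in range(0, len(pairs), segment_len):
        segment = pairs[start:start + segment_len]
        loss, grad = segment_loss_and_grad(weights, segment)
        total += loss  # score the segment BEFORE adapting to it
        if lr > 0.0:
            # Plain SGD step on the segment just seen (an assumption).
            for i in range(vocab):
                for k in range(vocab):
                    weights[i][k] -= lr * grad[i][k]
    return total / len(pairs)


# A strongly repetitive sequence, where adaptation should help.
tokens = [0, 1, 2, 3] * 25
static_nll = evaluate(tokens, vocab=4, lr=0.0)
dynamic_nll = evaluate(tokens, vocab=4, lr=0.5)
print(f"static:  {static_nll:.3f} nats/token")
print(f"dynamic: {dynamic_nll:.3f} nats/token")
```

With `lr=0.0` the parameters never change, which corresponds to ordinary static evaluation. Because the loss for each segment is recorded before the update, the model never sees a token before predicting it, so the comparison between the two settings stays fair.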


Taxonomy

Topics: Topic Modeling · Neural Networks and Applications · Natural Language Processing Techniques