Go Forth and Prosper: Language Modeling with Ancient Textual History

Rik Koncel-Kedziorski; Noah A. Smith

arXiv:2104.08742·cs.CL·April 20, 2021

TL;DR

This paper presents a method for improving document-level language models by copying relevant earlier text from outside the current context window into that window. It reduces perplexity by 7% on Wikipedia articles and 12% on scientific texts without updating the model's parameters.

Contribution

The authors learn an auxiliary function that selects text spans from earlier in the document. The selected spans are copied into the language model's context window in place of less predictive spans, improving perplexity without retraining the model.

Findings

01. 7% perplexity reduction on Wikipedia articles
02. 12% perplexity reduction on scientific texts
03. An auxiliary function trained on Wikipedia also works on scientific publications

Abstract

We introduce a technique for improving document-level language models (LM) by leveraging "ancient history": text that is outside the LM's current context window. We learn an auxiliary function to select spans from the ancient history which can help the LM to predict future text. The selected text spans are then copied directly into the LM's context window, replacing less predictive spans. This method can improve perplexity of pretrained LMs with no updates to the LM's own parameters. We further observe that an auxiliary function trained in a specific textual domain like Wikipedia will also work in a substantially different domain such as scientific publications. With this technique we see a 7 percent perplexity reduction on Wikipedia articles, and a 12 percent perplexity reduction on scientific texts.
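
The span-replacement idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `overlap_score` is a hypothetical stand-in (simple word overlap with the most recent span) for the learned auxiliary function, and `build_context`, `window`, and `protect` are names invented here. Keeping the selected spans in document order is also an assumption.

```python
from typing import Callable, List, Sequence

Span = List[str]  # a span is a list of tokens


def overlap_score(span: Sequence[str], query: Sequence[str]) -> float:
    """Hypothetical stand-in for the learned auxiliary function.

    Scores a span by Jaccard word overlap with the query span.
    The paper learns this function; overlap is used here only for illustration.
    """
    a, b = set(span), set(query)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_context(
    spans: List[Span],
    window: int,
    protect: int = 1,
    score: Callable[[Sequence[str], Sequence[str]], float] = overlap_score,
) -> List[Span]:
    """Pick `window` spans to place in the LM context.

    spans   -- all spans of the document seen so far, in order
    window  -- how many spans fit in the LM's context window
    protect -- how many of the most recent spans are always kept
    """
    if not 1 <= protect <= window:
        raise ValueError("protect must be between 1 and window")
    if len(spans) <= window:
        return list(spans)  # everything fits, nothing to replace

    query = spans[-1]  # most recent span, used as the scoring query
    protected = list(range(len(spans) - protect, len(spans)))
    candidates = range(len(spans) - protect)  # ancient and replaceable spans

    # Rank candidates by score; ties go to the more recent span.
    ranked = sorted(candidates, key=lambda i: (score(spans[i], query), i), reverse=True)
    chosen = ranked[: window - protect]

    # Keep the selected spans in their original document order (an assumption).
    return [spans[i] for i in sorted(chosen + protected)]


spans = [
    ["forth", "is", "a", "stack", "language"],  # ancient history
    ["the", "weather", "was", "mild"],
    ["lunch", "was", "served", "late"],
    ["the", "stack", "holds", "values"],        # most recent span
]
context = build_context(spans, window=3)
```

With a window of three spans, the ancient first span displaces the unrelated "lunch" span because it shares a word with the most recent text. In the setting the abstract describes, the pretrained LM would then be evaluated on the rebuilt context with its parameters left unchanged.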

Code & Models

Repositories

rikdz/AHLM (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications