Curriculum-Based Neighborhood Sampling For Sequence Prediction

James O' Neill; Danushka Bollegala

arXiv:1809.05916·cs.LG·September 18, 2018

TL;DR

This paper introduces a curriculum learning approach called Nearest-Neighbor Replacement Sampling to reduce exposure bias in language models, improving multi-step prediction accuracy by gradually introducing stochasticity during training.

Contribution

It proposes a novel curriculum learning method that replaces inputs with similar neighbors to better handle exposure bias in sequence prediction models.

Findings

01. Improves performance on language modeling benchmarks.

02. Works well with scheduled sampling to reduce compounding errors.

03. Requires minimal additional memory.

Abstract

The task of multi-step ahead prediction in language models is challenging considering the discrepancy between training and testing. At test time, a language model is required to make predictions given past predictions as input, instead of the past targets that are provided during training. This difference, known as exposure bias, can lead to the compounding of errors along a generated sequence at test time. In order to improve generalization in neural language models and address compounding errors, we propose a curriculum learning based method that gradually changes an initially deterministic teacher policy to a gradually more stochastic policy, which we refer to as Nearest-Neighbor Replacement Sampling. A chosen input at a given timestep is replaced with a sampled nearest neighbor of the past target with a truncated probability proportional to the cosine similarity between…
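The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract is cut off before it fully specifies the similarity terms, so this sketch assumes the similarity is taken between the target's embedding and other vocabulary embeddings, that "truncated" means keeping only the `k` most similar tokens, and that the curriculum is a linear ramp. The names `sample_neighbor_replacement`, `linear_schedule`, `replace_prob`, `k`, and `max_prob` are hypothetical.

```python
import numpy as np


def sample_neighbor_replacement(target_id, embeddings, replace_prob, k=5, rng=None):
    """Return target_id, or with probability replace_prob one of its k nearest neighbors.

    Assumption: embeddings is a (vocab_size, dim) array with no all-zero rows
    and vocab_size > k.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Keep the ground-truth token with probability 1 - replace_prob.
    if rng.random() >= replace_prob:
        return target_id
    # Cosine similarity between the target embedding and every vocabulary embedding.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = unit @ unit[target_id]
    sims[target_id] = -np.inf  # never "replace" the target with itself
    # Truncation (assumed): keep only the k most similar tokens.
    top = np.argpartition(-sims, k - 1)[:k]
    # Sample proportionally to similarity; negative similarities get zero weight.
    weights = np.clip(sims[top], 0.0, None)
    if weights.sum() == 0.0:
        return target_id  # no usable neighbor, fall back to the target
    return int(rng.choice(top, p=weights / weights.sum()))


def linear_schedule(step, total_steps, max_prob=0.5):
    """Assumed curriculum: replacement probability rises linearly from 0 to max_prob."""
    return max_prob * min(step / total_steps, 1.0)
```

During training, a loop would call `linear_schedule` once per step and pass the result as `replace_prob`, so early steps use the ground-truth inputs (a deterministic teacher) and later steps see progressively more neighbor substitutions.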

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification