Bilevel Scheduled Sampling for Dialogue Generation

Jiawen Liu; Kan Li

arXiv:2309.01953·cs.CL·September 6, 2023

TL;DR

This paper introduces a bilevel scheduled sampling approach for dialogue generation that considers sentence-level information alongside word-level quality, improving diversity and reducing exposure bias.

Contribution

The paper proposes a bilevel scheduled sampling method that combines sentence-level and word-level information through a smooth mapping function for better dialogue generation.

Findings

01 Significantly reduces exposure bias in dialogue models

02 Outperforms existing scheduled sampling methods on DailyDialog and PersonaChat datasets

03 Enhances sampling diversity and model adaptability

Abstract

Exposure bias is a common challenge in many natural language processing tasks, particularly in dialogue generation. Researchers have devised various techniques in response, among which scheduled sampling has proven effective at mitigating exposure bias. However, existing state-of-the-art scheduled sampling methods consider only the quality of the currently sampled word when applying threshold truncation sampling. This overlooks the importance of sentence-level information, and the threshold truncation method itself warrants further discussion. In this paper, we propose a bilevel scheduled sampling model that takes sentence-level information into account and combines it with word-level quality. To enhance sampling diversity and improve the model's adaptability, we propose a smooth function that maps the combined result of sentence-level and word-level…
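For readers unfamiliar with the baseline the abstract builds on, the following is a minimal sketch of plain word-level scheduled sampling, not the bilevel method proposed in the paper. During training, each decoder input token is taken from the model's own prediction with some probability and from the gold response otherwise. The function names `mix_decoder_inputs` and `linear_schedule`, the linear ramp, and the `max_prob` value are all hypothetical choices made for illustration; the paper's sentence-level scoring and smooth mapping function are not reproduced here.

```python
import random


def mix_decoder_inputs(gold_tokens, predicted_tokens, sample_prob, rng=None):
    """Build decoder inputs by choosing, per position, between the gold token
    and the model's own predicted token.

    Illustrative sketch of vanilla scheduled sampling only. `sample_prob` is
    the probability of feeding the model's prediction instead of the gold token.
    """
    if len(gold_tokens) != len(predicted_tokens):
        raise ValueError("gold and predicted sequences must have equal length")
    if not 0.0 <= sample_prob <= 1.0:
        raise ValueError("sample_prob must lie in [0, 1]")
    rng = rng or random.Random()

    mixed = []
    for gold, pred in zip(gold_tokens, predicted_tokens):
        # rng.random() is in [0, 1), so sample_prob=0 always keeps gold
        # and sample_prob=1 always takes the prediction.
        use_prediction = rng.random() < sample_prob
        mixed.append(pred if use_prediction else gold)
    return mixed


def linear_schedule(step, total_steps, max_prob=0.5):
    """Hypothetical schedule: ramp the sampling probability linearly from 0
    at step 0 to `max_prob` at `total_steps`, then hold it there."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    fraction = min(max(step / total_steps, 0.0), 1.0)
    return max_prob * fraction


if __name__ == "__main__":
    gold = ["how", "are", "you", "today"]
    predicted = ["how", "is", "you", "doing"]
    prob = linear_schedule(step=50, total_steps=100)
    print(prob, mix_decoder_inputs(gold, predicted, prob, random.Random(0)))
```

In this sketch the sampling probability depends only on the training step. The abstract describes making the decision depend on word-level quality combined with sentence-level information instead, which would replace the fixed `sample_prob` with a per-token value.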

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques