Bilevel Scheduled Sampling for Dialogue Generation
Jiawen Liu, Kan Li

TL;DR
This paper introduces a bilevel scheduled sampling approach for dialogue generation that considers sentence-level information alongside word-level quality, improving diversity and reducing exposure bias.
Contribution
It proposes a novel bilevel scheduled sampling method that integrates sentence-level and word-level information with a smooth mapping function for better dialogue generation.
Findings
Significantly reduces exposure bias in dialogue models
Outperforms existing scheduled sampling methods on DailyDialog and PersonaChat datasets
Enhances sampling diversity and model adaptability
Abstract
Exposure bias is a common challenge in many natural language processing tasks, particularly in dialogue generation. Researchers have devised various techniques in response, among which scheduled sampling has proven effective at mitigating exposure bias. However, existing state-of-the-art scheduled sampling methods consider only the quality of the word currently being sampled when applying threshold truncation sampling. This overlooks the importance of sentence-level information, and the threshold truncation method itself warrants further discussion. In this paper, we propose a bilevel scheduled sampling model that takes sentence-level information into account and combines it with word-level quality. To enhance sampling diversity and improve the model's adaptability, we propose a smooth function that maps the combined result of sentence-level and word-level…
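The contrast the abstract draws between hard threshold truncation and a smooth mapping can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear combination weight `alpha`, the use of a logistic (sigmoid) curve as the smooth function, its steepness `k` and `midpoint`, the convention that a higher score makes feeding the model's own prediction more likely, and all function names.

```python
import math
import random


def hard_threshold_prob(word_score, threshold=0.5):
    """Threshold truncation on the word-level score alone: the probability is all or nothing."""
    return 1.0 if word_score >= threshold else 0.0


def smooth_sampling_prob(word_score, sentence_score, alpha=0.5, k=10.0, midpoint=0.5):
    """Map a combined bilevel score in [0, 1] to a sampling probability.

    Assumption: the two levels are combined linearly and passed through a
    logistic curve. The paper's actual mapping may differ.
    """
    combined = alpha * sentence_score + (1.0 - alpha) * word_score
    return 1.0 / (1.0 + math.exp(-k * (combined - midpoint)))


def mix_decoder_inputs(gold_tokens, predicted_tokens, word_scores, sentence_score, rng):
    """Pick, per position, either the gold token or the model's own prediction."""
    mixed = []
    for gold, pred, word_score in zip(gold_tokens, predicted_tokens, word_scores):
        p = smooth_sampling_prob(word_score, sentence_score)
        # With probability p feed the predicted token, otherwise keep the gold token.
        mixed.append(pred if rng.random() < p else gold)
    return mixed


if __name__ == "__main__":
    rng = random.Random(0)
    gold = ["how", "are", "you"]
    pred = ["how", "is", "you"]
    print(mix_decoder_inputs(gold, pred, [0.9, 0.2, 0.6], sentence_score=0.7, rng=rng))
```

Under these assumptions, a word score just below the threshold still has a non-zero chance of being sampled, and the sentence-level score shifts that chance up or down, whereas the hard threshold switches abruptly and ignores the rest of the sentence.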
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
