TL;DR
This study examines how training dialogue models with a semantic loss as an auxiliary objective affects response diversity. It finds that the semantic loss improves diversity on smaller datasets but has limited impact on larger ones, and it also explores the utility of large language model embeddings.
Contribution
The paper introduces a semantic loss as an auxiliary training objective to enhance diversity in dialogue generation models, and compares its effect across datasets of different sizes.
Findings
Semantic loss improves response diversity in small datasets.
Semantic loss has limited impact on larger datasets.
Large language model embeddings are more effective when used in a semantic loss than when used as initialization.
Abstract
Neural models trained for next-utterance generation in dialogue tasks learn to mimic the n-gram sequences in the training set with training objectives like negative log-likelihood (NLL) or cross-entropy. Such commonly used training objectives do not encourage generating alternate responses to a context. However, the effect of minimizing an alternate training objective that encourages a model to generate an alternate response and scores it on semantic similarity has not been well studied. We hypothesize that a language generation model can improve its diversity by learning to generate alternate text during training and minimizing a semantic loss as an auxiliary objective. We explore this idea on two datasets of different sizes on the task of next-utterance generation in goal-oriented dialogues. We make two observations: (1) minimizing a semantic objective improved diversity in responses in the smaller…
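To make the idea of an auxiliary semantic objective concrete, here is a minimal sketch of how an NLL term and a semantic-similarity term might be combined into one scalar loss. This is not the authors' formulation. The mean-pooled embeddings, the cosine-based `semantic_loss`, the `weight` parameter, and the toy embedding table are all assumptions made for illustration. The sketch also ignores how gradients would flow through a sampled response.

```python
import math

# Hypothetical toy embedding table; a real setup would use pretrained embeddings.
EMBEDDINGS = {
    "book": [1.0, 0.0],
    "reserve": [0.9, 0.1],
    "cancel": [-1.0, 0.0],
    "a": [0.5, 0.5],
    "table": [0.0, 1.0],
}

def nll(token_probs):
    """Mean negative log-likelihood of the reference tokens under the model."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def mean_embedding(tokens, embeddings):
    """Average the token vectors to get one sentence vector (assumed pooling)."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for tok in tokens:
        for i, x in enumerate(embeddings[tok]):
            total[i] += x
    return [x / len(tokens) for x in total]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def semantic_loss(generated, reference, embeddings):
    """Assumed semantic loss: 1 - cosine similarity of the sentence vectors."""
    return 1.0 - cosine(mean_embedding(generated, embeddings),
                        mean_embedding(reference, embeddings))

def combined_loss(token_probs, generated, reference, embeddings, weight=0.5):
    """NLL on the reference plus a weighted semantic term on an alternate response."""
    return nll(token_probs) + weight * semantic_loss(generated, reference, embeddings)

reference = ["book", "a", "table"]
paraphrase = ["reserve", "a", "table"]   # alternate wording, similar meaning
unrelated = ["cancel", "a", "table"]     # different meaning
probs = [0.6, 0.8, 0.7]                  # hypothetical model probabilities

loss_paraphrase = combined_loss(probs, paraphrase, reference, EMBEDDINGS)
loss_unrelated = combined_loss(probs, unrelated, reference, EMBEDDINGS)
```

Under these assumptions, an alternate response that stays close in meaning to the reference adds little to the loss even though its tokens differ. A pure n-gram objective cannot express that.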
