A Brief Study on the Effects of Training Generative Dialogue Models with a Semantic loss

Prasanna Parthasarathi; Mohamed Abdelsalam; Joelle Pineau; Sarath Chandar

arXiv:2106.10619·cs.CL·June 22, 2021

TL;DR

This study investigates how training dialogue models with a semantic loss as an auxiliary objective affects response diversity. It finds that the semantic loss improves diversity on the smaller of two datasets but has limited impact on the larger one, and it examines how useful large language model embeddings are in that loss.

Contribution

The paper uses a semantic loss as an auxiliary training objective to improve diversity in dialogue generation models and compares its effect across two datasets of different sizes.

Findings

01. Semantic loss improves response diversity on the smaller dataset.

02. Semantic loss has limited impact on the larger dataset.

03. Large language model embeddings are more effective when used in the semantic loss than as initialization.

Abstract

Neural models trained for next utterance generation in dialogue tasks learn to mimic the n-gram sequences in the training set with training objectives like negative log-likelihood (NLL) or cross-entropy. Such commonly used training objectives do not foster generating alternate responses to a context. However, the effects of minimizing an alternate training objective that encourages a model to generate an alternate response and scores it on semantic similarity have not been well studied. We hypothesize that a language generation model can improve its diversity by learning to generate alternate text during training and minimizing a semantic loss as an auxiliary objective. We explore this idea on two data sets of different sizes on the task of next utterance generation in goal-oriented dialogues. We make two observations (1) minimizing a semantic objective improved diversity in responses in the smaller…
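
To make the idea of an auxiliary semantic objective concrete, the following is a minimal sketch, not the authors' implementation. It assumes the semantic loss is a cosine distance between mean-pooled token embeddings of a sampled response and of the reference, and that it is added to the NLL with a weight `alpha`. The function names, the pooling choice, and `alpha` are all hypothetical. The sketch only computes the loss value; sampling a response is not differentiable, so a real training loop would need a separate mechanism to pass a learning signal through that step.

```python
import math


def mean_pool(token_vectors):
    """Average equal-length token vectors into one sentence vector (assumed pooling)."""
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


def cosine_distance(u, v):
    """Return 1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def nll(reference_token_probs):
    """Mean negative log-likelihood the model assigns to the reference tokens."""
    total = sum(math.log(p) for p in reference_token_probs)
    return -total / len(reference_token_probs)


def combined_loss(reference_token_probs, sampled_vectors, reference_vectors, alpha=0.5):
    """NLL plus a weighted semantic term (hypothetical weighting scheme)."""
    semantic = cosine_distance(mean_pool(sampled_vectors), mean_pool(reference_vectors))
    return nll(reference_token_probs) + alpha * semantic
```

Under these assumptions, a sampled response whose pooled embedding matches the reference adds nothing to the loss even if its wording differs, which is the property that could let a model be rewarded for alternate phrasings.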

Code & Models

Repositories

ppartha03/Semantic-Loss-Dialogue-Generation (PyTorch, official)
