A Diversity-Promoting Objective Function for Neural Conversation Models

Jiwei Li; Michel Galley; Chris Brockett; Jianfeng Gao; Bill Dolan

arXiv:1510.03055·cs.CL·June 14, 2016·254 cites

TL;DR

This paper replaces the standard likelihood objective of neural conversation models with one based on Maximum Mutual Information (MMI), yielding responses that are more diverse and more relevant to the input message.

Contribution

The paper proposes Maximum Mutual Information (MMI) as the objective function for neural conversation models, in place of the likelihood of the response given the message, which favors safe, commonplace replies such as "I don't know".

Findings

01 MMI-based models generate more diverse, interesting, and appropriate responses than likelihood-based models.

02 BLEU scores improve substantively on two conversational datasets.

03 Human evaluations favor the responses of the MMI models.

Abstract

Sequence-to-sequence neural network models for generation of conversational responses tend to generate safe, commonplace responses (e.g., "I don't know") regardless of the input. We suggest that the traditional objective function, i.e., the likelihood of output (response) given input (message) is unsuited to response generation tasks. Instead we propose using Maximum Mutual Information (MMI) as the objective function in neural models. Experimental results demonstrate that the proposed MMI models produce more diverse, interesting, and appropriate responses, yielding substantive gains in BLEU scores on two conversational datasets and in human evaluations.
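One way to read the objective described in the abstract: the mutual information between a message S and a response T is log [p(S, T) / (p(S) p(T))], so for a fixed message, maximizing it over T is the same as maximizing log p(T|S) - log p(T). A weighted form, log p(T|S) - λ log p(T), penalizes responses that are likely regardless of the input. The Python sketch below applies that weighted score to rerank candidate responses. It is a minimal illustration, not the authors' implementation: the function name `mmi_rerank`, the dictionary keys, the value of λ, and the toy log-probabilities are all assumptions made up for this example.

```python
def mmi_rerank(candidates, lam=0.5):
    """Rank candidates by log p(T|S) - lam * log p(T), best first.

    `candidates` is a list of dicts with hypothetical keys:
      "text"            - the candidate response T
      "log_p_t_given_s" - log p(T|S) from a conversation model
      "log_p_t"         - log p(T) from a language model over responses
    """
    scored = [
        (c["log_p_t_given_s"] - lam * c["log_p_t"], c["text"])
        for c in candidates
    ]
    # Sort by score only, highest first; ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]


# Toy numbers, invented for illustration: the generic reply is more likely
# given the message, but it is also far more likely in general.
candidates = [
    {"text": "I don't know.",
     "log_p_t_given_s": -2.0, "log_p_t": -1.0},
    {"text": "I'm heading to the lake this weekend.",
     "log_p_t_given_s": -3.0, "log_p_t": -8.0},
]

likelihood_order = mmi_rerank(candidates, lam=0.0)  # plain likelihood ranking
mmi_order = mmi_rerank(candidates, lam=0.5)         # generic replies penalized
print(likelihood_order[0])
print(mmi_order[0])
```

With λ = 0 the ranking reduces to plain likelihood and the generic reply comes first; with λ = 0.5 its high unconditional probability counts against it, and the more specific reply is ranked first.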

Code & Models

Models

DyNin/carbot (model)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems