Dialogue Response Ranking Training with Large-Scale Human Feedback Data

Xiang Gao; Yizhe Zhang; Michel Galley; Chris Brockett; Bill Dolan

arXiv:2009.06978 · cs.CL · September 16, 2020 · 6 cites

TL;DR

This paper introduces DialogRPT, a set of GPT-2 based feedback prediction models trained on 133 million pairs of human feedback data, which improves response ranking in open-domain dialogue systems by aligning more closely with human preferences.

Contribution

The paper presents a novel large-scale training dataset and a GPT-2 based ranking model that outperforms baselines in predicting engaging responses and correlates well with human preferences.

Findings

01 DialogRPT outperforms baselines in feedback prediction.

02 The model correlates better with human preferences.

03 Combining feedback and scoring models improves response ranking.

Abstract

Existing open-domain dialog models are generally trained to minimize the perplexity of target human responses. However, some human replies are more engaging than others, spawning more followup interactions. Current conversational models are increasingly capable of producing turns that are context-relevant, but in order to produce compelling agents, these models need to be able to predict and optimize for turns that are genuinely engaging. We leverage social media feedback data (number of replies and upvotes) to build a large-scale training dataset for feedback prediction. To alleviate possible distortion between the feedback and engagingness, we convert the ranking problem to a comparison of response pairs which involve few confounding factors. We trained DialogRPT, a set of GPT-2 based models on 133M pairs of human feedback data and the resulting ranker outperformed several baselines.…
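The abstract's idea of turning ranking into a comparison of response pairs can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the helper names `build_pairs` and `pairwise_loss`, the `min_gap` threshold, and the softmax-style loss over two scalar scores are all assumptions made for the example, and the abstract does not specify the exact pairing criteria or loss.

```python
import math
from itertools import combinations


def build_pairs(responses, min_gap=1):
    """Turn replies to ONE context into (preferred, less_preferred) pairs.

    `responses` is a list of (text, feedback_count) tuples, where the count
    could be upvotes or number of replies. Pairing only within the same
    context is one way to limit confounding factors (assumption).
    `min_gap` is a hypothetical threshold that skips near-ties.
    """
    pairs = []
    for (text_a, fb_a), (text_b, fb_b) in combinations(responses, 2):
        if abs(fb_a - fb_b) < min_gap:
            continue  # feedback too similar to call one reply "better"
        if fb_a > fb_b:
            pairs.append((text_a, text_b))
        else:
            pairs.append((text_b, text_a))
    return pairs


def pairwise_loss(score_pos, score_neg):
    """-log(exp(s+) / (exp(s+) + exp(s-))) for two scalar ranker scores.

    Algebraically this is softplus(-(s+ - s-)); it is computed in a
    numerically stable form. The loss form itself is an assumption.
    """
    diff = score_pos - score_neg
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


if __name__ == "__main__":
    replies = [("reply a", 10), ("reply b", 2), ("reply c", 2)]
    for better, worse in build_pairs(replies):
        print(better, ">", worse)
    # The loss shrinks as the preferred reply is scored higher.
    print(pairwise_loss(2.0, 0.0), pairwise_loss(0.0, 2.0))
```

In a real training loop the two scores would come from a learned ranker applied to (context, response) inputs; here they are plain floats so the loss can be inspected in isolation.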

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques

Methods: Linear Layer · Cosine Annealing · Layer Normalization · Weight Decay · Dropout · Dense Connections · Linear Warmup With Cosine Annealing · Attention Dropout · Byte Pair Encoding · Multi-Head Attention