Dual Task Framework for Improving Persona-grounded Dialogue Dataset
Minju Kim, Beong-woo Kwak, Youngwook Kim, Hong-in Lee, Seung-won Hwang, Jinyoung Yeo

TL;DR
This paper presents a data-centric method that enhances persona-grounded dialogue datasets by fixing annotation artifacts through a dual task framework, leading to improved dialogue agent performance.
Contribution
The paper introduces a persona augmentation approach that leverages the primal-dual structure of two tasks, response prediction and persona prediction, to fix annotation artifacts in the dataset and thereby improve dialogue model accuracy.
Findings
Outperforms pre-trained LMs by 11.7 points in accuracy on Persona-Chat
Effectively fixes annotation artifacts in dialogue datasets
Applicable to any dialogue model regardless of architecture
Abstract
This paper introduces a simple yet effective data-centric approach for the task of improving persona-conditioned dialogue agents. Prior model-centric approaches unquestioningly depend on the raw crowdsourced benchmark datasets such as Persona-Chat. In contrast, we aim to fix annotation artifacts in benchmarking, which is orthogonally applicable to any dialogue model. Specifically, we augment relevant personas to improve dialogue dataset/agent, by leveraging the primal-dual structure of the two tasks, predicting dialogue responses and personas based on each other. Experiments on Persona-Chat show that our approach outperforms pre-trained LMs by an 11.7 point gain in terms of accuracy.
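The primal-dual idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: word overlap stands in for the learned models, and every name here (`overlap`, `dual_predict_persona`, `primal_predict_response`, `augment_personas`, the `threshold` value, and the example sentences) is a hypothetical choice made for illustration. The sketch assumes that one kind of annotation artifact is a gold response that the listed personas do not support, and shows how a dual step (response to persona) might add a relevant persona that then helps the primal step (persona to response).

```python
# Toy illustration only: word overlap is an assumed stand-in for learned models.

def tokens(text):
    """Lowercased word set with basic punctuation stripped."""
    return {w.strip(".,!?").lower() for w in text.split()} - {""}

def overlap(a, b):
    """Jaccard overlap between two strings, used as a stand-in relevance score."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def dual_predict_persona(response, candidate_personas):
    """Dual task (toy): pick the persona sentence most relevant to a response."""
    return max(candidate_personas, key=lambda p: overlap(response, p))

def primal_predict_response(personas, candidate_responses):
    """Primal task (toy): pick the response best supported by any persona."""
    return max(candidate_responses,
               key=lambda r: max(overlap(r, p) for p in personas))

def augment_personas(personas, gold_response, persona_pool, threshold=0.2):
    """Add the pool persona linked to the gold response if it is relevant enough.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    best = dual_predict_persona(gold_response, persona_pool)
    if best not in personas and overlap(gold_response, best) >= threshold:
        return personas + [best]
    return personas

# Hypothetical example data (not taken from Persona-Chat).
personas = ["i like to read books"]
gold_response = "i walk my dog every morning"
persona_pool = ["i have a dog", "i work as a nurse", "i love pizza"]
candidates = [gold_response, "i am not sure what to say"]

before = primal_predict_response(personas, candidates)
augmented = augment_personas(personas, gold_response, persona_pool)
after = primal_predict_response(augmented, candidates)
```

In this toy setting the original persona set does not support the gold response, so the primal step picks the distractor; after the dual step adds "i have a dog", the primal step selects the gold response. The actual method in the paper relies on trained models rather than word overlap.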
Taxonomy
Topics: AI in Service Interactions · Topic Modeling · Persona Design and Applications
