Counterfactual Data Augmentation via Perspective Transition for Open-Domain Dialogues

Jiao Ou; Jinchao Zhang; Yang Feng; Jie Zhou

arXiv:2210.16838 · cs.CL · November 1, 2022

TL;DR

This paper introduces a counterfactual data augmentation approach for open-domain dialogues that generates semantically diverse responses to improve dialogue system training, outperforming baselines.

Contribution

It proposes a novel counterfactual inference method to automatically augment dialogue datasets with diverse responses, reducing manual data collection efforts.

Findings

01 Augmented datasets improve downstream dialogue task performance.

02 The method effectively generates semantically varied responses.

03 The method outperforms existing baselines in experiments.

Abstract

The construction of open-domain dialogue systems requires high-quality dialogue datasets. Dialogue data admits a wide variety of responses for a given dialogue history, especially responses with different semantics. However, collecting such a high-quality dataset is labor-intensive and time-consuming in most scenarios. In this paper, we propose a data augmentation method that automatically generates high-quality responses with different semantics by counterfactual inference. Specifically, given an observed dialogue, our counterfactual generation model first infers semantically different responses by replacing the observed reply perspective with substituted ones. Furthermore, our data selection method filters out detrimental augmented responses. Experimental results show that our data augmentation method can augment high-quality responses with different semantics for a given dialogue…
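The two stages named in the abstract (generate responses under substituted perspectives, then filter out detrimental ones) can be pictured with a minimal Python sketch. It is not the authors' implementation: the function names `augment`, `toy_generate`, and `toy_keep` are hypothetical, the perspectives are a fixed list rather than the output of counterfactual inference, and the generator and filter are toy stand-ins for the paper's learned models.

```python
from typing import Callable, Iterable, List, Tuple

# A dialogue is modeled here as (history, observed response); this is an assumption.
Dialogue = Tuple[str, str]


def augment(
    dialogues: Iterable[Dialogue],
    perspectives: List[str],
    generate: Callable[[str, str], str],
    keep: Callable[[str, str, str], bool],
) -> List[Dialogue]:
    """Return the original pairs plus the candidate responses that pass the filter."""
    augmented: List[Dialogue] = []
    for history, observed in dialogues:
        # Always retain the observed (original) response.
        augmented.append((history, observed))
        for perspective in perspectives:
            # Stage 1: produce a response under a substituted perspective.
            candidate = generate(history, perspective)
            # Stage 2: data selection drops candidates judged detrimental.
            if keep(history, observed, candidate):
                augmented.append((history, candidate))
    return augmented


def toy_generate(history: str, perspective: str) -> str:
    # Hypothetical stand-in for a learned counterfactual generation model.
    return f"[{perspective}] reply to: {history}"


def toy_keep(history: str, observed: str, candidate: str) -> bool:
    # Hypothetical stand-in for the selection step: keep candidates that
    # differ from the observed response and are not trivially short.
    return candidate != observed and len(candidate.split()) >= 3


data = [("Do you like hiking?", "Yes, I go every weekend.")]
result = augment(data, ["weather", "food"], toy_generate, toy_keep)
```

Passing the generator and the filter as callables keeps the two stages separable, so either one could be swapped for a real model without changing the loop.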

Code & Models

Repositories

ictnlp/capt
Official

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques