Data Augmentation for Conversational AI

Heydar Soudani; Evangelos Kanoulas; Faegheh Hasibi

arXiv:2309.04739·cs.CL·March 5, 2024

TL;DR

This paper reviews data augmentation techniques for conversational AI, emphasizing their importance in low-resource settings, recent advances, evaluation methods, challenges, and future research directions.

Contribution

It provides a comprehensive overview of recent data augmentation approaches in conversational systems, highlighting new methods and evaluation paradigms.

Findings

01 Recent advances in conversation augmentation techniques

02 Open domain and task-oriented conversation generation methods

03 Discussion of challenges and future research directions

Abstract

Advancements in conversational systems have revolutionized information access, surpassing the limitations of single queries. However, developing dialogue systems requires a large amount of training data, which is a challenge in low-resource domains and languages. Traditional data collection methods like crowd-sourcing are labor-intensive and time-consuming, making them ineffective in this context. Data augmentation (DA) is an effective approach to alleviate the data scarcity problem in conversational systems. This tutorial provides a comprehensive and up-to-date overview of DA approaches in the context of conversational systems. It highlights recent advances in conversation augmentation, open domain and task-oriented conversation generation, and different paradigms of evaluating these models. We also discuss current challenges and future directions in order to help researchers and…

Code & Models

Repositories

dataug-convai/dataug-convai.github.io
none · Official
