GOLD: Improving Out-of-Scope Detection in Dialogues using Data Augmentation

Derek Chen; Zhou Yu

arXiv:2109.03079·cs.CL·September 8, 2021

TL;DR

GOLD is a data augmentation technique that significantly improves out-of-scope detection in dialogue systems, especially in low-data scenarios, by generating and filtering pseudo-labeled samples to enhance model training.

Contribution

The paper introduces GOLD, a data augmentation method that improves out-of-scope (OOS) detection by generating pseudo-labeled candidates and keeping only the most beneficial ones for training, outperforming existing approaches.

Findings

01 The top GOLD model achieves relative gains of roughly 50% (52.4%, 48.9% and 50.3%, as reported in the abstract).

02 It outperforms all existing methods on all key metrics across three target benchmarks.

03 Analysis reveals key factors for effective OOS data augmentation.

Abstract

Practical dialogue systems require robust methods of detecting out-of-scope (OOS) utterances to avoid conversational breakdowns and related failure modes. Directly training a model with labeled OOS examples yields reasonable performance, but obtaining such data is a resource-intensive process. To tackle this limited-data problem, previous methods focus on better modeling the distribution of in-scope (INS) examples. We introduce GOLD as an orthogonal technique that augments existing data to train better OOS detectors operating in low-data regimes. GOLD generates pseudo-labeled candidates using samples from an auxiliary dataset and keeps only the most beneficial candidates for training through a novel filtering mechanism. In experiments across three target benchmarks, the top GOLD model outperforms all existing methods on all key metrics, achieving relative gains of 52.4%, 48.9% and 50.3%…
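
The generate-then-filter idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the paper's implementation: the bag-of-words cosine retrieval, the vote-based filter, the helper names (`generate_candidates`, `filter_candidates`) and the example utterances are all hypothetical stand-ins for whatever matching and filtering GOLD actually uses.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (toy stand-in for a real encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def generate_candidates(seed_oos, auxiliary, k=3):
    """ASSUMPTION: pull the k auxiliary utterances most similar to each seed
    OOS example and treat them as pseudo-labeled OOS candidates."""
    candidates = []
    for seed in seed_oos:
        ranked = sorted(auxiliary, key=lambda u: cosine(bow(seed), bow(u)), reverse=True)
        for utt in ranked[:k]:
            if utt not in candidates:
                candidates.append(utt)
    return candidates


def filter_candidates(candidates, detectors, min_votes):
    """ASSUMPTION: keep a candidate only if at least min_votes of the given
    detector functions flag it as out-of-scope."""
    return [c for c in candidates if sum(bool(d(c)) for d in detectors) >= min_votes]


# Hypothetical data: a banking assistant with one known OOS utterance.
seed_oos = ["what is the weather tomorrow"]
auxiliary = [
    "will the weather be sunny tomorrow",
    "transfer money to my savings account",
    "tell me a joke about the weather",
    "check my account balance tomorrow",
]
in_scope_vocab = {"transfer", "money", "account", "balance"}

# Two toy detectors standing in for trained OOS classifiers.
detectors = [
    lambda u: not (set(u.split()) & in_scope_vocab),
    lambda u: "weather" in u.split(),
]

candidates = generate_candidates(seed_oos, auxiliary, k=3)
kept = filter_candidates(candidates, detectors, min_votes=2)
print(kept)
```

In this toy run the retrieval step admits one in-scope banking utterance as a candidate, and the filter removes it again, which is the role the abstract assigns to the filtering mechanism.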

Taxonomy

Topics: Speech and dialogue systems · Topic Modeling · Natural Language Processing Techniques