GOLD: Improving Out-of-Scope Detection in Dialogues using Data Augmentation
Derek Chen, Zhou Yu

TL;DR
GOLD is a data augmentation technique that significantly improves out-of-scope detection in dialogue systems, especially in low-data scenarios, by generating and filtering pseudo-labeled samples to enhance model training.
Contribution
The paper introduces GOLD, a novel data augmentation method that enhances OOS detection by generating and selecting beneficial pseudo-labeled data, outperforming existing approaches.
Findings
GOLD achieves relative gains of roughly 50% on key metrics (52.4%, 48.9%, and 50.3%).
It outperforms all existing methods across three benchmarks.
Analysis reveals key factors for effective OOS data augmentation.
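As a reading aid for the relative improvements reported here, the short sketch below shows how a relative gain is conventionally computed. The function name `relative_gain` and the scores 40.0 and 60.0 are made-up illustrations, not values from the paper.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Return the improvement as a fraction of the baseline score."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline


# Hypothetical scores chosen only to illustrate the arithmetic.
example = relative_gain(40.0, 60.0)
print(f"{example:.1%}")
```

A relative gain differs from an absolute gain: in the made-up case above the absolute gain is 20 points, while the relative gain is half of the baseline.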
Abstract
Practical dialogue systems require robust methods of detecting out-of-scope (OOS) utterances to avoid conversational breakdowns and related failure modes. Directly training a model with labeled OOS examples yields reasonable performance, but obtaining such data is a resource-intensive process. To tackle this limited-data problem, previous methods focus on better modeling the distribution of in-scope (INS) examples. We introduce GOLD as an orthogonal technique that augments existing data to train better OOS detectors operating in low-data regimes. GOLD generates pseudo-labeled candidates using samples from an auxiliary dataset and keeps only the most beneficial candidates for training through a novel filtering mechanism. In experiments across three target benchmarks, the top GOLD model outperforms all existing methods on all key metrics, achieving relative gains of 52.4%, 48.9% and 50.3%…
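The generate-then-filter idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the Jaccard token-overlap similarity, and the vocabulary-based filter score are all assumptions introduced here, since the abstract does not specify how candidates are matched or scored.

```python
from typing import Callable, List, Set


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity (an assumed stand-in for a learned matcher)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def generate_candidates(
    seed_oos: List[str],
    auxiliary_pool: List[str],
    similarity: Callable[[str, str], float],
    per_seed: int = 2,
) -> List[str]:
    """Pseudo-label as OOS the auxiliary utterances closest to each seed."""
    candidates: List[str] = []
    for seed in seed_oos:
        ranked = sorted(
            auxiliary_pool, key=lambda u: similarity(seed, u), reverse=True
        )
        candidates.extend(ranked[:per_seed])
    # Remove duplicates while preserving order.
    return list(dict.fromkeys(candidates))


def filter_candidates(
    candidates: List[str], score: Callable[[str], float], keep: int
) -> List[str]:
    """Keep only the highest-scoring candidates for training."""
    return sorted(candidates, key=score, reverse=True)[:keep]


# Hypothetical toy data, for illustration only.
seed_oos = ["what is the weather today"]
auxiliary_pool = [
    "what is the weather like in paris",
    "book a table for two",
    "is the weather nice today",
    "transfer money to savings",
]
in_scope_vocab: Set[str] = {
    "book", "table", "transfer", "money", "is", "the", "what",
}


def outside_vocab_share(utterance: str) -> float:
    """Assumed filter score: share of tokens outside the in-scope vocabulary."""
    tokens = utterance.lower().split()
    return sum(t not in in_scope_vocab for t in tokens) / len(tokens)


candidates = generate_candidates(seed_oos, auxiliary_pool, jaccard)
kept = filter_candidates(candidates, outside_vocab_share, keep=1)
print(kept)
```

In a real system the similarity and the filter score would likely come from trained models rather than token overlap. The sketch only shows the two-stage shape: propose pseudo-labeled candidates from an auxiliary dataset, then keep the subset judged most useful.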
Taxonomy
Topics: Speech and dialogue systems · Topic Modeling · Natural Language Processing Techniques
