OodGAN: Generative Adversarial Network for Out-of-Domain Data Generation

Petr Marek; Vishal Ishwar Naik; Vincent Auvray; Anuj Goyal

arXiv:2104.02484·cs.CL·April 7, 2021

TL;DR

OodGAN introduces a sequential GAN model that directly generates out-of-domain text data, simplifying the architecture and improving OOD detection performance in dialog systems.

Contribution

The paper presents a SeqGAN-based model that generates OOD text directly, removing the auto-encoder components used by earlier approaches and thereby simplifying training.

Findings

01 Outperforms the state of the art on OOD detection metrics

02 Achieves a 67% relative improvement in FPR 0.95 on the ROSTD dataset

03 Achieves a 28% relative improvement in FPR 0.95 on the OSQ dataset
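
To make the metric behind these findings concrete, here is a minimal sketch of how an FPR-at-fixed-TPR figure and a relative improvement could be computed. It assumes FPR 0.95 means the false positive rate at 95% true positive rate, that OOD is the positive class, and that a higher score means "more likely OOD"; the paper's exact convention may differ. The function names `fpr_at_tpr` and `relative_improvement` and all scores are hypothetical and are not taken from the paper.

```python
import math


def fpr_at_tpr(ood_scores, ind_scores, target_tpr=0.95):
    """False positive rate at the threshold that catches target_tpr of OOD samples.

    Assumption: higher score = more likely OOD, and OOD is the positive class.
    """
    if not ood_scores or not ind_scores:
        raise ValueError("both score lists must be non-empty")
    # Rank OOD scores from most to least confident.
    ranked = sorted(ood_scores, reverse=True)
    # Number of OOD samples that must be flagged to reach the target TPR.
    k = math.ceil(target_tpr * len(ranked))
    threshold = ranked[k - 1]
    # In-domain samples scoring at or above the threshold are false positives.
    false_positives = sum(1 for s in ind_scores if s >= threshold)
    return false_positives / len(ind_scores)


def relative_improvement(baseline_fpr, new_fpr):
    """Fractional reduction in FPR relative to the baseline (lower FPR is better)."""
    return (baseline_fpr - new_fpr) / baseline_fpr


# Made-up detector scores, for illustration only.
ood = [0.9, 0.8, 0.7, 0.6, 0.95, 0.85, 0.75, 0.65, 0.55, 0.4]
ind = [0.1, 0.2, 0.3, 0.5, 0.05, 0.15, 0.25, 0.35, 0.45, 0.02]
print(fpr_at_tpr(ood, ind))  # 0.2: two of ten in-domain scores reach the threshold 0.4
```

Under this reading, a "67% relative improvement" means the new FPR is about one third of the baseline FPR, not that the FPR dropped by 67 percentage points.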

Abstract

Detecting an Out-of-Domain (OOD) utterance is crucial for a robust dialog system. Most dialog systems are trained on a pool of annotated OOD data to achieve this goal. However, collecting annotated OOD data for a given domain is an expensive process. To mitigate this issue, previous works have proposed generative adversarial network (GAN) based models to generate OOD data for a given domain automatically. However, these proposed models do not work directly with the text. They work with the text's latent space instead, forcing them to include components responsible for encoding text into latent space and decoding it back, such as an auto-encoder. These components increase the model complexity, making it difficult to train. We propose OodGAN, a sequential generative adversarial network (SeqGAN) based model for OOD data generation. Our proposed model works directly on the text…
