OodGAN: Generative Adversarial Network for Out-of-Domain Data Generation
Petr Marek, Vishal Ishwar Naik, Vincent Auvray, Anuj Goyal

TL;DR
OodGAN introduces a sequential GAN model that directly generates out-of-domain text data, simplifying the architecture and improving OOD detection performance in dialog systems.
Contribution
The paper presents a novel SeqGAN-based model for direct text generation of OOD data, removing auto-encoder components and enhancing training simplicity.
Findings
Outperforms the state of the art on OOD detection metrics
Achieves a 67% relative improvement in FPR 0.95 on the ROSTD dataset
Achieves a 28% relative improvement in FPR 0.95 on the OSQ dataset
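The findings above are reported as relative improvements in FPR 0.95, which is commonly read as the false positive rate at the threshold where the true positive rate reaches 0.95. The sketch below shows one way such a metric and a relative improvement could be computed. It is not the authors' evaluation code: the function names, the choice of which class counts as positive, the "score at or above threshold" convention, and the toy scores are all assumptions made for illustration.

```python
import math


def fpr_at_tpr(pos_scores, neg_scores, target_tpr=0.95):
    """False positive rate at the strictest threshold that still
    accepts at least `target_tpr` of the positive examples.

    Assumption: a higher score means "more likely positive", and an
    example is accepted when its score is >= the threshold.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(pos_scores, reverse=True)
    # Number of positives that must be accepted to reach the target TPR.
    k = max(1, math.ceil(target_tpr * len(ranked)))
    threshold = ranked[k - 1]
    false_positives = sum(1 for s in neg_scores if s >= threshold)
    return false_positives / len(neg_scores)


def relative_improvement(baseline, new):
    """Relative reduction of a lower-is-better metric such as FPR."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - new) / baseline


# Toy, made-up detector scores (not data from the paper).
positives = [0.9, 0.8, 0.7, 0.2]
negatives = [0.1, 0.3, 0.6, 0.75]
toy_fpr = fpr_at_tpr(positives, negatives, target_tpr=0.75)
toy_gain = relative_improvement(baseline=0.5, new=toy_fpr)
```

Because a lower FPR is better, a relative improvement here means the false positive rate shrank by that fraction of the baseline value, not by that many percentage points.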
Abstract
Detecting an Out-of-Domain (OOD) utterance is crucial for a robust dialog system. Most dialog systems are trained on a pool of annotated OOD data to achieve this goal. However, collecting the annotated OOD data for a given domain is an expensive process. To mitigate this issue, previous works have proposed generative adversarial network (GAN)-based models to generate OOD data for a given domain automatically. However, these proposed models do not work directly with the text. They work with the text's latent space instead, which forces these models to include components responsible for encoding text into latent space and decoding it back, such as an auto-encoder. These components increase the model complexity, making it difficult to train. We propose OodGAN, a sequential generative adversarial network (SeqGAN)-based model for OOD data generation. Our proposed model works directly on the text…
