LAD: Language Models as Data for Zero-Shot Dialog
Shikib Mehri, Yasemin Altun, Maxine Eskenazi

TL;DR
This paper introduces LAD, a method using GPT-3 to generate diverse synthetic data for zero-shot task-oriented dialog, significantly improving model performance without relying on human-labeled data.
Contribution
LAD is a novel approach to generating synthetic dialog data that respects structural constraints, using GPT-3 to improve the training of zero-shot dialog models.
Findings
+15% intent prediction accuracy
+31.4 F-1 slot filling improvement
Training with LAD is competitive with human dialogs
Abstract
To facilitate zero-shot generalization in task-oriented dialog, this paper proposes Language Models as Data (LAD). LAD is a paradigm for creating diverse and accurate synthetic data which conveys the necessary structural constraints and can be used to train a downstream neural dialog model. LAD leverages GPT-3 to induce linguistic diversity. LAD achieves significant performance gains in zero-shot settings on intent prediction (+15%), slot filling (+31.4 F-1) and next action prediction (+11 F-1). Furthermore, an interactive human evaluation shows that training with LAD is competitive with training on human dialogs. LAD is open-sourced, with the code and data available at https://github.com/Shikib/lad.
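The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that seed utterances are produced by filling templates with slot values (the structural constraints) and that a language model then returns rewordings, which are kept only if every slot value survives. All function names, the template, and the slot values are hypothetical, and `toy_paraphrase` is a stand-in for a call to a model such as GPT-3.

```python
from itertools import product
from typing import Callable, Dict, List


def fill_templates(template: str, slot_values: Dict[str, List[str]]) -> List[dict]:
    """Create seed examples by filling a template with every slot combination."""
    # Only use slots that actually appear in the template.
    slots = [s for s in slot_values if "{" + s + "}" in template]
    examples = []
    for combo in product(*(slot_values[s] for s in slots)):
        assignment = dict(zip(slots, combo))
        examples.append({"text": template.format(**assignment), "slots": assignment})
    return examples


def augment(
    examples: List[dict],
    paraphrase: Callable[[str], List[str]],
    intent: str,
) -> List[dict]:
    """Keep only paraphrases that still contain every slot value verbatim."""
    kept = []
    for ex in examples:
        for candidate in paraphrase(ex["text"]):
            if all(v.lower() in candidate.lower() for v in ex["slots"].values()):
                kept.append({"text": candidate, "intent": intent, "slots": ex["slots"]})
    return kept


def toy_paraphrase(text: str) -> List[str]:
    """Hypothetical stand-in for a language-model call that rewords an utterance."""
    return [
        text.replace("I want to fly", "Could you book me a flight"),
        "Book me a trip.",  # loses the slot values, so augment() filters it out
    ]
```

Under these assumptions, calling `augment(fill_templates(...), toy_paraphrase, "book_flight")` yields labeled utterances whose wording varies while the intent and slot annotations stay valid, which is the property a downstream dialog model would need from synthetic training data.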
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · AI in Service Interactions
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Weight Decay · Linear Warmup With Cosine Annealing · Residual Connection · Attention Dropout · Dropout
