LAD: Language Models as Data for Zero-Shot Dialog

Shikib Mehri; Yasemin Altun; Maxine Eskenazi

arXiv:2207.14393·cs.CL·August 1, 2022·1 cites

TL;DR

This paper introduces LAD, a method using GPT-3 to generate diverse synthetic data for zero-shot task-oriented dialog, significantly improving model performance without relying on human-labeled data.

Contribution

LAD presents a novel approach for generating synthetic dialog data that respects structural constraints, using GPT-3, to improve the training of zero-shot dialog models.

Findings

01. +15% intent prediction accuracy

02. +31.4 F-1 slot filling improvement

03. Training with LAD is competitive with human dialogs

Abstract

To facilitate zero-shot generalization in task-oriented dialog, this paper proposes Language Models as Data (LAD). LAD is a paradigm for creating diverse and accurate synthetic data which conveys the necessary structural constraints and can be used to train a downstream neural dialog model. LAD leverages GPT-3 to induce linguistic diversity. LAD achieves significant performance gains in zero-shot settings on intent prediction (+15%), slot filling (+31.4 F-1) and next action prediction (+11 F-1). Furthermore, an interactive human evaluation shows that training with LAD is competitive with training on human dialogs. LAD is open-sourced, with the code and data available at https://github.com/Shikib/lad.
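
The general idea in the abstract (labeled seed data that encodes the task structure, then rewritten for linguistic diversity) can be illustrated with a small sketch. This is not the authors' implementation, which is in the linked repository. The schema, templates, and function names below are all hypothetical, and `paraphrase` is a deterministic stand-in for what would be a GPT-3 call in LAD.

```python
import random

# Hypothetical schema, invented for illustration only:
# each intent has templates and candidate slot values.
SCHEMA = {
    "book_flight": {
        "templates": ["I want to fly to {city} on {date}."],
        "slots": {"city": ["Paris", "Tokyo"], "date": ["Monday", "Friday"]},
    },
    "check_balance": {
        "templates": ["What is the balance of my {account} account?"],
        "slots": {"account": ["savings", "checking"]},
    },
}


def seed_examples(schema, rng):
    """Fill each template with sampled slot values; labels come from the schema."""
    examples = []
    for intent, spec in schema.items():
        for template in spec["templates"]:
            values = {name: rng.choice(opts) for name, opts in spec["slots"].items()}
            examples.append(
                {"text": template.format(**values), "intent": intent, "slots": values}
            )
    return examples


def paraphrase(text, rng):
    """Placeholder for a language-model rewrite (assumption: an LLM call goes here)."""
    prefixes = ["Hi, ", "Hello! ", "Could you help me? "]
    return rng.choice(prefixes) + text


def diversify(examples, rng, n_variants=2):
    """Rewrite each seed several times, keeping only label-preserving rewrites."""
    out = []
    for ex in examples:
        for _ in range(n_variants):
            text = paraphrase(ex["text"], rng)
            # Discard rewrites that lost a slot value, so the labels stay valid.
            if all(value in text for value in ex["slots"].values()):
                out.append({**ex, "text": text})
    return out


def build_dataset(seed=0, n_variants=2):
    rng = random.Random(seed)
    return diversify(seed_examples(SCHEMA, rng), rng, n_variants)
```

Calling `build_dataset()` returns a list of dictionaries with `text`, `intent`, and `slots` keys that could be fed to a downstream intent or slot-filling model. The slot-value filter in `diversify` is one simple way to keep rewritten utterances consistent with their labels; it is an illustrative choice, not a claim about how LAD validates its data.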

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · AI in Service Interactions

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Weight Decay · Linear Warmup With Cosine Annealing · Residual Connection · Attention Dropout · Dropout