Unsupervised Slot Schema Induction for Task-oriented Dialog
Dian Yu, Mingqiu Wang, Yuan Cao, Izhak Shafran, Laurent El Shafey, Hagen Soltau

TL;DR
This paper introduces an unsupervised method for inducing slot schemas from unlabeled dialog data, reducing the need for manual schema design in task-oriented dialog systems.
Contribution
The paper presents a data-driven approach that uses in-domain language models and unsupervised parsing structures to extract candidate slots, then applies coarse-to-fine clustering to induce slot types without supervision.
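As a rough illustration of the coarse-to-fine clustering idea named in the abstract, the sketch below groups candidate spans loosely first and then splits each group with a tighter threshold. It is not the authors' algorithm: the greedy leader clustering, the function names `leader_cluster` and `coarse_to_fine`, the radii, and the 2-D vectors are all assumptions made for illustration, standing in for whatever representations and clustering procedure the paper actually uses.

```python
from math import dist

def leader_cluster(points, radius):
    """Greedy leader clustering (an assumed stand-in, not the paper's method):
    put each point in the first cluster whose leader (first member) lies
    within `radius`, otherwise start a new cluster."""
    clusters = []  # each cluster is a list of (label, vector) pairs
    for label, vec in points:
        for cluster in clusters:
            if dist(vec, cluster[0][1]) <= radius:
                cluster.append((label, vec))
                break
        else:
            clusters.append([(label, vec)])
    return clusters

def coarse_to_fine(points, coarse_radius, fine_radius):
    """Cluster loosely first, then re-cluster each coarse group more tightly."""
    return [leader_cluster(coarse, fine_radius)
            for coarse in leader_cluster(points, coarse_radius)]

# Hypothetical candidate spans with made-up 2-D embeddings (illustration only).
candidates = [
    ("cheap", (0.0, 0.0)),
    ("expensive", (0.2, 0.1)),
    ("north", (1.0, 0.0)),
    ("centre", (1.1, 0.1)),
    ("7 pm", (5.0, 5.0)),
    ("noon", (5.1, 5.2)),
]

schema = coarse_to_fine(candidates, coarse_radius=2.0, fine_radius=0.5)
# Keep only the span text: coarse groups -> fine groups -> spans.
induced = [[[label for label, _ in fine] for fine in coarse] for coarse in schema]
print(induced)
```

With these toy vectors the coarse pass separates the time-like spans from the rest, and the fine pass then splits the remaining group into a price-like and an area-like cluster. In a real setting the vectors would come from a learned representation and the thresholds would need tuning.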
Findings
The unsupervised method significantly outperforms several strong supervised baselines on slot schema induction.
The induced schemas are effective for downstream dialog state tracking and response generation.
Results are reported on two dialog datasets, MultiWoz and SGD (Schema-Guided Dialogue).
Abstract
Carefully-designed schemas describing how to collect and annotate dialog corpora are a prerequisite towards building task-oriented dialog systems. In practical applications, manually designing schemas can be error-prone, laborious, iterative, and slow, especially when the schema is complicated. To alleviate this expensive and time-consuming process, we propose an unsupervised approach for slot schema induction from unlabeled dialog corpora. Leveraging in-domain language models and unsupervised parsing structures, our data-driven approach extracts candidate slots without constraints, followed by coarse-to-fine clustering to induce slot types. We compare our method against several strong supervised baselines, and show significant performance improvement in slot schema induction on MultiWoz and SGD datasets. We also demonstrate the effectiveness of induced schemas on downstream…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Stochastic Gradient Descent
