OneNet: Joint Domain, Intent, Slot Prediction for Spoken Language Understanding
Young-Bum Kim, Sungjin Lee, Karl Stratos

TL;DR
This paper introduces OneNet, a unified neural network that jointly predicts domain, intent, and slots in spoken language understanding. Compared with traditional pipeline systems, it avoids error propagation between stages and lets the three tasks share information.
Contribution
The paper presents a novel multitask learning architecture that integrates domain, intent, and slot prediction into a single model, enhancing accuracy on real user data.
Findings
Significant improvements in all three tasks, across all domains, over strong baselines
Orthography-sensitive input encoding and curriculum training contribute to the gains
Outperforms a baseline that uses oracle domain prediction
Abstract
In practice, most spoken language understanding systems process user input in a pipelined manner; first domain is predicted, then intent and semantic slots are inferred according to the semantic frames of the predicted domain. The pipeline approach, however, has some disadvantages: error propagation and lack of information sharing. To address these issues, we present a unified neural network that jointly performs domain, intent, and slot predictions. Our approach adopts a principled architecture for multitask learning to fold in the state-of-the-art models for each task. With a few more ingredients, e.g. orthography-sensitive input encoding and curriculum training, our model delivered significant improvements in all three tasks across all domains over strong baselines, including one using oracle prediction for domain detection, on real user data of a commercial personal assistant.
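To make the idea of joint prediction concrete, the following is a minimal sketch of a multitask objective that sums one loss per task for a single utterance. It is an illustration only, not the authors' implementation: the function names (`cross_entropy`, `joint_loss`), the unweighted sum, and the per-token softmax treatment of slots are all assumptions, since the abstract does not specify the loss.

```python
import math


def cross_entropy(probs, gold):
    """Negative log-probability assigned to the gold label index."""
    return -math.log(probs[gold])


def joint_loss(domain_probs, intent_probs, slot_probs,
               gold_domain, gold_intent, gold_slots):
    """Hypothetical joint objective: unweighted sum of the three task losses.

    domain_probs / intent_probs: one probability distribution each.
    slot_probs: one probability distribution per token (assumed per-token softmax).
    """
    loss_domain = cross_entropy(domain_probs, gold_domain)
    loss_intent = cross_entropy(intent_probs, gold_intent)
    loss_slots = sum(cross_entropy(p, g) for p, g in zip(slot_probs, gold_slots))
    return loss_domain + loss_intent + loss_slots


# Toy utterance with two tokens; all probabilities are made up for illustration.
example = joint_loss(
    domain_probs=[0.7, 0.3],
    intent_probs=[0.2, 0.8],
    slot_probs=[[0.9, 0.1], [0.4, 0.6]],
    gold_domain=0,
    gold_intent=1,
    gold_slots=[0, 1],
)
```

Because all three terms would be minimized together over shared parameters, a mistake on one task contributes gradient to the same encoder that serves the other two. A pipeline, by contrast, trains each stage separately and hands the predicted domain downstream.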
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Speech and dialogue systems
