On joint training with interfaces for spoken language understanding
Anirudh Raju, Milind Rao, Gautam Tiwari, Pranav Dheram, Bryan Anderson, Zhe Zhang, Chul Lee, Bach Bui, Ariya Rastrow

TL;DR
This paper investigates how different interfaces between ASR and NLU modules affect joint training in spoken language understanding systems, achieving state-of-the-art results on the SLURP dataset.
Contribution
The paper introduces methods for joint training over both text and neural interfaces, reports improved performance, and analyzes how pretrained models and training-data size affect the results.
Findings
State-of-the-art results on the publicly available 50-hr SLURP dataset
Neural interfaces outperform text interfaces in joint training
Pretrained models' impact diminishes with more training data
Abstract
Spoken language understanding (SLU) systems extract both text transcripts and semantics associated with intents and slots from input speech utterances. SLU systems usually consist of (1) an automatic speech recognition (ASR) module, (2) an interface module that exposes relevant outputs from ASR, and (3) a natural language understanding (NLU) module. Interfaces in SLU systems carry information on text transcriptions or richer information like neural embeddings from ASR to NLU. In this paper, we study how interfaces affect joint-training for spoken language understanding. Most notably, we obtain the state-of-the-art results on the publicly available 50-hr SLURP dataset. We first leverage large-size pretrained ASR and NLU models that are connected by a text interface, and then jointly train both models via a sequence loss function. For scenarios where pretrained models are not utilized,…
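The abstract mentions joint training through a text interface "via a sequence loss function" but does not spell out its form. As an illustration only, the sketch below shows one common shape such a loss can take: an expected downstream cost over an n-best list of ASR hypotheses, weighted by the ASR probabilities renormalised over that list. Whether the paper uses exactly this formulation is an assumption, and the names `expected_sequence_loss`, `nbest`, and `downstream_cost` are hypothetical.

```python
import math
from typing import Callable, List, Tuple


def expected_sequence_loss(
    nbest: List[Tuple[str, float]],
    downstream_cost: Callable[[str], float],
) -> float:
    """Expected downstream cost over an n-best list (illustrative, not the paper's code).

    nbest: (hypothesis text, ASR log-probability) pairs.
    downstream_cost: hypothetical per-hypothesis cost, e.g. an NLU error measure.
    """
    if not nbest:
        raise ValueError("n-best list must not be empty")
    # Renormalise ASR scores over the n-best list (softmax of log-probabilities).
    max_lp = max(lp for _, lp in nbest)
    weights = [math.exp(lp - max_lp) for _, lp in nbest]
    total = sum(weights)
    # Weight each hypothesis' cost by its renormalised probability.
    return sum(
        (w / total) * downstream_cost(text)
        for (text, _), w in zip(nbest, weights)
    )


# Toy usage: cost is 0 when the transcript keeps the slot value "jazz", else 1.
nbest = [("play jazz music", math.log(0.6)), ("play jas music", math.log(0.4))]
loss = expected_sequence_loss(nbest, lambda text: 0.0 if "jazz" in text else 1.0)
```

In a real system the weights would come from a differentiable ASR model, so gradients of this expectation could reach the ASR parameters even though the interface itself is discrete text. A neural interface would instead pass embeddings directly and would not need this kind of expectation.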
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
