Revisiting the Boundary between ASR and NLU in the Age of Conversational Dialog Systems
Manaal Faruqui, Dilek Hakkani-Tür

TL;DR
This paper reviews the relationship between automatic speech recognition (ASR) and natural language understanding (NLU) in conversational dialog systems and argues for tighter integration between the two, along with closer collaboration between their research communities.
Contribution
It argues that ASR and NLU should be more closely integrated and learn from each other's errors, and it calls for end-to-end datasets with semantic annotations on spoken input and for stronger collaboration between the two research communities.
Findings
NLU should consider upstream ASR errors
ASR models can learn from NLU errors
End-to-end datasets with semantic annotations are needed
Abstract
As more users across the world are interacting with dialog agents in their daily life, there is a need for better speech understanding that calls for renewed attention to the dynamics between research in automatic speech recognition (ASR) and natural language understanding (NLU). We briefly review these research areas and lay out the current relationship between them. In light of the observations we make in this paper, we argue that (1) NLU should be cognizant of the presence of ASR models being used upstream in a dialog system's pipeline, (2) ASR should be able to learn from errors found in NLU, (3) there is a need for end-to-end datasets that provide semantic annotations on spoken input, and (4) there should be stronger collaboration between ASR and NLU research communities.
Taxonomy
Topics: Speech and dialogue systems · Natural Language Processing Techniques · Topic Modeling
