Adapting Text-based Dialogue State Tracker for Spoken Dialogues
Jaeseok Yoon, Seunghyun Hwang, Ran Han, Jeonguk Bang, Kee-Eung Kim

TL;DR
This paper presents an engineering approach to adapting text-based dialogue state tracking models to spoken dialogues by combining speech recognition error correction, post-processing, and data augmentation, and reports improved robustness on spoken input.
Contribution
The paper introduces a practical system combining error correction, post-processing, and data augmentation to effectively transfer text-based dialogue state tracking to spoken dialogue scenarios.
Findings
Explicit speech recognition error correction improves accuracy.
Post-processing enhances slot value estimation.
Data augmentation aids in adapting models to spoken dialogue data.
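As a rough illustration of what slot-value post-processing can look like, the sketch below snaps a predicted slot value to the closest entry in a list of known values, so that a misrecognized value such as "cambrige" can be recovered. This is only a minimal sketch under stated assumptions, not the authors' method: the function name `snap_to_ontology`, the candidate list, and the 0.6 similarity cutoff are all hypothetical, and the matching uses Python's standard `difflib` module rather than anything described in the paper.

```python
import difflib


def snap_to_ontology(predicted, candidates, cutoff=0.6):
    """Return the candidate closest to `predicted`, or `predicted` unchanged.

    Hypothetical helper for illustration only. `candidates` stands in for a
    list of known slot values; `cutoff` is an assumed similarity threshold.
    """
    # Compare case-insensitively, but return the candidate's original casing.
    lowered = {c.lower(): c for c in candidates}
    matches = difflib.get_close_matches(
        predicted.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else predicted


if __name__ == "__main__":
    known_values = ["Cambridge", "London", "Norwich"]  # assumed example values
    print(snap_to_ontology("cambrige", known_values))
```

Returning the prediction unchanged when nothing is similar enough keeps the step conservative: open-ended values that have no close match in the candidate list pass through untouched.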
Abstract
Although there have been remarkable advances in dialogue systems through the dialogue systems technology competition (DSTC), building a robust task-oriented dialogue system with a speech interface remains one of the key challenges. Most of the progress has been made for text-based dialogue systems, since datasets with written corpora are abundant while those with spoken dialogues are very scarce. However, as can be seen from voice assistant systems such as Siri and Alexa, it is of practical importance to transfer this success to spoken dialogues. In this paper, we describe our engineering effort in building a highly successful model that participated in the speech-aware dialogue systems technology challenge track in DSTC11. Our model consists of three major modules: (1) automatic speech recognition error correction to bridge the gap between the spoken and the text utterances,…
Taxonomy
Topics: Speech and dialogue systems · Topic Modeling · Natural Language Processing Techniques
