Towards Better Understanding of Spontaneous Conversations: Overcoming Automatic Speech Recognition Errors With Intent Recognition
Piotr Żelasko, Jan Mizgajski, Mikołaj Morzy, Adrian Szymczak, Piotr Szymański, Łukasz Augustyniak, Yishay Carmiel

TL;DR
This paper introduces a novel intent recognition method, based on finite state transducers (FSTs), for correcting automatic speech recognition (ASR) errors in spontaneous human-human conversations. It improves intent detection despite speech disfluencies and scarce labeled data.
Contribution
The paper presents a new FST framework and fuzzy search algorithm for intent recognition that enhances transcript accuracy and insight extraction in spontaneous dialogues.
Findings
Increased intent recognition by 25% over baseline
Effective handling of disfluencies and speech errors
Improved transcript rescoring using intent indexing
Abstract
In this paper, we present a method for correcting automatic speech recognition (ASR) errors using a finite state transducer (FST) intent recognition framework. Intent recognition is a powerful technique for dialog flow management in turn-oriented, human-machine dialogs. This technique can also be very useful in the context of human-human dialogs, though it serves a different purpose of key insight extraction from conversations. We argue that currently available intent recognition techniques are not applicable to human-human dialogs due to the complex structure of turn-taking and various disfluencies encountered in spontaneous conversations, exacerbated by speech recognition errors and scarcity of domain-specific labeled data. Without efficient key insight extraction techniques, raw human-human dialog transcripts remain significantly unexploited. Our contribution consists of a novel…
Taxonomy
Topics: Speech and dialogue systems · Natural Language Processing Techniques · Topic Modeling
Methods: Pruning
