User-Initiated Repetition-Based Recovery in Multi-Utterance Dialogue Systems
Hoang Long Nguyen, Vincent Renkens, Joris Pelemans, Srividya Pranavi Potharaju, Anil Kumar Nalamalapu, Murat Akbacak

TL;DR
This paper introduces a system enabling users to correct speech recognition errors in virtual assistants by repeating misunderstood words, using a neural network to rewrite queries and improve understanding.
Contribution
It presents an end-to-end attention pointer network that effectively rewrites queries based on user repetitions, outperforming rule-based methods.
Findings
Reduces Word Error Rate by 19% relative at 2% false alarm rate
Outperforms rule-based baseline in query rewriting accuracy
Demonstrates effectiveness of repetition-based recovery in dialogue systems
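To make the "19% relative" figure concrete, here is a minimal sketch of how Word Error Rate and a relative reduction are usually computed. The function names and the numbers in the usage note are illustrative assumptions, not values or code from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(wer_before: float, wer_after: float) -> float:
    """Fraction of the original error rate that was removed."""
    return (wer_before - wer_after) / wer_before
```

As a hypothetical illustration, a system whose WER drops from 10.0% to 8.1% has achieved a 19% relative reduction, even though the absolute drop is only 1.9 percentage points.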
Abstract
Recognition errors are common in human communication. Similar errors often lead to unwanted behaviour in dialogue systems or virtual assistants. In human communication, we can recover from them by repeating misrecognized words or phrases; however, in human-machine communication this recovery mechanism is not available. In this paper, we attempt to bridge this gap and present a system that allows a user to correct speech recognition errors in a virtual assistant by repeating misunderstood words. When a user repeats part of the phrase, the system rewrites the original query to incorporate the correction. This rewrite allows the virtual assistant to understand the original query successfully. We present an end-to-end 2-step attention pointer network that can generate the rewritten query by merging the incorrectly understood utterance with the correction follow-up. We evaluate…
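The merge step described in the abstract can be pictured with a small sketch. It assumes, hypothetically, that the two pointer steps select the first and last token of the span to replace in the original query; the summary does not spell this out, and the trained network itself is not reproduced here. The span is passed in by hand, and `rewrite_query` is an illustrative name rather than the authors' code.

```python
from typing import List, Optional, Tuple


def rewrite_query(
    original: List[str],
    follow_up: List[str],
    span: Optional[Tuple[int, int]],
) -> List[str]:
    """Replace original[start..end] (inclusive) with the follow-up tokens.

    `span` stands in for the output of a pointer model (an assumption).
    A span of None means the follow-up was judged not to be a correction,
    so the original query is returned unchanged.
    """
    if span is None:
        return list(original)
    start, end = span
    if not (0 <= start <= end < len(original)):
        raise ValueError("span must lie inside the original query")
    return original[:start] + follow_up + original[end + 1:]


# Hypothetical example: the recognizer heard "hannah", the user repeats "anna".
original = "call hannah on speaker".split()
follow_up = ["anna"]
rewritten = rewrite_query(original, follow_up, span=(1, 1))
print(" ".join(rewritten))
```

Returning the query unchanged when no span is given mirrors the false-alarm concern in the findings: a follow-up that is not a repetition should not trigger a rewrite.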
Taxonomy
Methods: Softmax · Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Pointer Network
