An Efficient Self-Learning Framework For Interactive Spoken Dialog Systems
Hitesh Tulsiani, David M. Chan, Shalini Ghosh, Garima Lalwani, Prabhat Pandey, Ankish Bansal, Sri Garimella, Ariya Rastrow, Björn Hoffmeister

TL;DR
This paper presents a novel self-learning framework for dialog system ASR that adapts over time using user feedback and context, significantly reducing word error rates on both real-world and synthetic datasets.
Contribution
The work introduces a general, context-aware self-learning framework leveraging student-teacher models and contrastive self-supervision for improved dialog ASR.
Findings
Nearly 10% relative WER reduction in real-world systems
Up to 26% WER reduction on synthetic data
Effective adaptation to multi-turn conversations
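The findings above are stated as WER reductions. As a reading aid, here is a minimal sketch of how word error rate and a relative reduction are conventionally computed. The function names and the example numbers are hypothetical illustrations, not taken from the paper, and the paper's own evaluation tooling may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER removed by the new system."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers: a drop from 20% to 18% WER is a 2-point absolute
# reduction, which corresponds to roughly a 10% relative reduction.
example_reduction = relative_wer_reduction(0.20, 0.18)
```

Under this convention, a relative reduction is measured against the baseline system's WER, so the same relative figure can correspond to very different absolute changes depending on the baseline.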
Abstract
Dialog systems, such as voice assistants, are expected to engage with users in complex, evolving conversations. Unfortunately, traditional automatic speech recognition (ASR) systems deployed in such applications are usually trained to recognize each turn independently and lack the ability to adapt to the conversational context or incorporate user feedback. In this work, we introduce a general framework for ASR in dialog systems that can go beyond learning from single-turn utterances and learn over time how to adapt to both explicit supervision and implicit user feedback present in multi-turn conversations. We accomplish that by leveraging advances in student-teacher learning and context-aware dialog processing, and designing contrastive self-supervision approaches with Ohm, a new online hard-negative mining approach. We show that leveraging our new framework compared to traditional…
Taxonomy
Topics: Speech and dialogue systems · Multimodal Machine Learning Applications · Multi-Agent Systems and Negotiation
