Contextual Language Model Adaptation for Conversational Agents
Anirudh Raju, Behnam Hedayatnia, Linda Liu, Ankur Gandhe, Chandra Khatri, Angeliki Metallinou, Anu Venkatesh, Ariya Rastrow

TL;DR
This paper introduces a DNN-based method that adapts the language model of a conversational agent to each interaction using contextual information, improving speech recognition accuracy overall and the recognition of named entities in particular.
Contribution
The paper presents a novel framework for context-dependent language model adaptation using neural networks, enhancing ASR performance in conversational agents.
Findings
3% relative WER improvement with 1-pass decoding
6% relative WER improvement with 2-pass decoding
Up to 15% relative improvement in recognizing named entities
Abstract
Statistical language models (LM) play a key role in Automatic Speech Recognition (ASR) systems used by conversational agents. These ASR systems should provide a high accuracy under a variety of speaking styles, domains, vocabulary and argots. In this paper, we present a DNN-based method to adapt the LM to each user-agent interaction based on generalized contextual information, by predicting an optimal, context-dependent set of LM interpolation weights. We show that this framework for contextual adaptation provides accuracy improvements under different possible mixture LM partitions that are relevant for both (1) Goal-oriented conversational agents where it's natural to partition the data by the requested application and for (2) Non-goal oriented conversational agents where the data can be partitioned using topic labels that come from predictions of a topic classifier. We obtain a…
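The interpolation idea described in the abstract can be illustrated with a minimal sketch. It assumes a single linear layer followed by a softmax as a stand-in for the paper's DNN, and every name and number here (`predict_weights`, `interpolate`, the context vector, the component probabilities) is hypothetical and chosen only for illustration; the paper's actual architecture and features are not reproduced.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into non-negative weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_weights(context, W, b):
    """Hypothetical stand-in for the DNN: one linear layer plus softmax.

    context: feature vector describing the current interaction
    W, b:    one row of weights and one bias per component LM
    """
    scores = [
        sum(w * x for w, x in zip(row, context)) + bias
        for row, bias in zip(W, b)
    ]
    return softmax(scores)


def interpolate(component_probs, weights):
    """Mixture LM probability: weighted sum of component LM probabilities."""
    return sum(w * p for w, p in zip(weights, component_probs))


# Assumed toy setup: 2 context features, 3 component LMs.
context = [1.0, 0.0]
W = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
b = [0.0, 0.0, 0.0]

# Context-dependent interpolation weights for this interaction.
weights = predict_weights(context, W, b)

# Made-up probabilities each component LM assigns to the same next word.
component_probs = [0.02, 0.001, 0.005]
p = interpolate(component_probs, weights)

print("weights:", [round(w, 3) for w in weights])
print("interpolated probability:", p)
```

Because the weights come out of a softmax, they always form a valid mixture, so the interpolated probability stays between the smallest and largest component probability. In this toy setup the context favors the first component LM, so its probability dominates the mixture.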
