Implementing contextual biasing in GPU decoder for online ASR
Iuliia Nigmatulina, Srikanth Madikeri, Esaú Villatoro-Tello, Petr Motlíček, Juan Zuluaga-Gomez, Karthik Pandia, Aravind Ganapathiraju

TL;DR
This paper introduces a method for integrating contextual biasing into real-time GPU decoding for online ASR, enabling dynamic rescoring and improving prediction accuracy without lattice generation.
Contribution
The paper presents a novel approach for implementing contextual biasing directly in GPU decoding for online ASR, with support for dynamic context switching and compatibility with the standard Kaldi GPU decoder.
Findings
Effective biasing of partial ASR predictions on GPU
Supports dynamic context switching during decoding
Code is publicly available and tested on open datasets
Abstract
GPU decoding significantly accelerates the output of ASR predictions. While GPUs are already being used for online ASR decoding, post-processing and rescoring on GPUs have not been properly investigated yet. Rescoring with available contextual information can considerably improve ASR predictions. Previous studies have proven the viability of lattice rescoring in decoding and biasing language model (LM) weights in offline and online CPU scenarios. In real-time GPU decoding, partial recognition hypotheses are produced without lattice generation, which makes the implementation of biasing more complex. The paper proposes and describes an approach to integrate contextual biasing in real-time GPU decoding while exploiting the standard Kaldi GPU decoder. Besides the biasing of partial ASR predictions, our approach also permits dynamic context switching allowing a flexible rescoring per each…
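To make the idea of biasing partial hypotheses concrete, the following is a minimal CPU-side sketch in Python. It is not the paper's GPU implementation and does not use any Kaldi API. It assumes that each partial result is available as a small list of scored word sequences and that biasing is a fixed log-score bonus for each matched context phrase. The names `Hypothesis`, `bias_bonus`, and `rescore_partial` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Hypothesis:
    words: List[str]
    score: float  # log-domain score; higher is better (assumption)


def bias_bonus(words: Sequence[str],
               context_phrases: Sequence[Sequence[str]],
               bonus: float) -> float:
    """Sum a fixed bonus for every occurrence of a context phrase in `words`."""
    total = 0.0
    for phrase in context_phrases:
        n = len(phrase)
        if n == 0:
            continue
        for i in range(len(words) - n + 1):
            if tuple(words[i:i + n]) == tuple(phrase):
                total += bonus
    return total


def rescore_partial(hyps: Sequence[Hypothesis],
                    context_phrases: Sequence[Sequence[str]],
                    bonus: float = 2.0) -> List[Hypothesis]:
    """Add the context bonus to each partial hypothesis and re-rank them."""
    rescored = [
        Hypothesis(list(h.words),
                   h.score + bias_bonus(h.words, context_phrases, bonus))
        for h in hyps
    ]
    return sorted(rescored, key=lambda h: h.score, reverse=True)


# Illustrative data only: two competing partial hypotheses for one segment.
partial = [
    Hypothesis(["contact", "tower", "one", "to", "one"], -4.0),
    Hypothesis(["contact", "tower", "one", "two", "one"], -5.0),
]

# Context switching in this sketch is just passing a different phrase list
# for each segment; no decoder state is rebuilt.
context_a = [["one", "two", "one"]]
context_b: List[List[str]] = []

best_with_a = rescore_partial(partial, context_a)[0]
best_with_b = rescore_partial(partial, context_b)[0]
print(" ".join(best_with_a.words))
print(" ".join(best_with_b.words))
```

In this sketch the context is an argument to the rescoring call rather than part of the decoding graph, which is one simple way to picture per-segment context switching. How the paper realizes this inside the GPU decoder is not shown here.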
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Natural Language Processing Techniques
