Applying GPGPU to Recurrent Neural Network Language Model based Fast Network Search in the Real-Time LVCSR
Kyungmin Lee, Chiyoun Park, Ilhwan Kim, Namhoon Kim, Jaewon Lee

TL;DR
This paper introduces a GPGPU-based acceleration method for recurrent neural network language models (RNNLMs) in real-time speech recognition, reaching real-time decoding speed while keeping the word error rate low.
Contribution
It presents a novel technique for applying GPGPUs to RNNLM-based graph traversal, reducing redundant computation on CPUs and data transfer between GPGPUs and CPUs to enable real-time LVCSR.
Findings
Achieved real-time decoding speeds on WSJ and in-house data
Reduced Word Error Rate (WER) by approximately 10% relative to n-gram models
Demonstrated effective GPGPU acceleration for RNNLM-based search
Abstract
Recurrent Neural Network Language Models (RNNLMs) have started to be used in various fields of speech recognition due to their outstanding performance. However, the high computational complexity of RNNLMs has been a hurdle in applying them to real-time Large Vocabulary Continuous Speech Recognition (LVCSR). To accelerate RNNLM-based network searches during decoding, we apply General Purpose Graphics Processing Units (GPGPUs). This paper proposes a novel method of applying GPGPUs to RNNLM-based graph traversals. We achieve our goal by reducing redundant computations on CPUs and the amount of data transferred between GPGPUs and CPUs. The proposed approach was evaluated on both the WSJ corpus and in-house data. Experiments show that the proposed approach achieves real-time speed in various circumstances while maintaining the Word Error Rate (WER) to be relatively…
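The two ideas named in the abstract, avoiding redundant RNNLM computation and limiting CPU-GPGPU transfers, can be illustrated with a small sketch. This is not the authors' implementation: it runs entirely on the CPU in plain Python, uses a toy RNN with random weights, and only simulates a device round trip with a counter. The names `ToyRNNLM`, `CachedScorer`, `score_batch`, `batches`, and `forward_passes` are hypothetical and chosen for illustration. The sketch assumes that search hypotheses sharing the same word history can share one hidden state and one output distribution, and that all uncached histories in a decoding step can be sent together in a single batch.

```python
import math
import random


class ToyRNNLM:
    """Tiny Elman-style RNN language model with random weights (illustrative only)."""

    def __init__(self, vocab_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.V, self.H = vocab_size, hidden_size
        self.W_in = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
                     for _ in range(vocab_size)]
        self.W_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
                      for _ in range(hidden_size)]
        self.W_out = [[rng.uniform(-0.5, 0.5) for _ in range(vocab_size)]
                      for _ in range(hidden_size)]

    def step(self, hidden, word):
        """Advance the hidden state by one word."""
        return [
            math.tanh(self.W_in[word][j]
                      + sum(hidden[i] * self.W_rec[i][j] for i in range(self.H)))
            for j in range(self.H)
        ]

    def log_probs(self, hidden):
        """Log-softmax over the vocabulary for a given hidden state."""
        logits = [sum(hidden[i] * self.W_out[i][w] for i in range(self.H))
                  for w in range(self.V)]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        return [x - log_z for x in logits]


class CachedScorer:
    """Scores (history, word) requests, reusing work for shared histories."""

    def __init__(self, model):
        self.model = model
        self.hidden_cache = {(): [0.0] * model.H}  # history -> hidden state
        self.dist_cache = {}                       # history -> log-prob vector
        self.batches = 0          # simulated host<->device round trips
        self.forward_passes = 0   # output-layer evaluations actually performed

    def _hidden(self, history):
        # Extend the cached hidden state of the longest already-known prefix.
        if history not in self.hidden_cache:
            prev = self._hidden(history[:-1])
            self.hidden_cache[history] = self.model.step(prev, history[-1])
        return self.hidden_cache[history]

    def score_batch(self, requests):
        """requests: list of (history_tuple, next_word); returns log-probabilities."""
        missing = {h for h, _ in requests if h not in self.dist_cache}
        if missing:
            self.batches += 1  # all uncached histories are sent together
            for h in missing:
                self.dist_cache[h] = self.model.log_probs(self._hidden(h))
                self.forward_passes += 1
        return [self.dist_cache[h][w] for h, w in requests]


# Four hypotheses, three of which share the history (1, 2).
model = ToyRNNLM(vocab_size=5, hidden_size=4)
scorer = CachedScorer(model)
requests = [((1, 2), 3), ((1, 2), 4), ((0,), 1), ((1, 2), 0)]
scores = scorer.score_batch(requests)
```

In this toy run, four requests need only two output-layer evaluations and one simulated transfer, and repeating the same requests triggers no further work. A real decoder would also have to bound the cache size and decide when histories are equivalent, which this sketch leaves out.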
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
