LSTM Language Models for LVCSR in First-Pass Decoding and Lattice-Rescoring
Eugen Beck, Wei Zhou, Ralf Schlüter, Hermann Ney

TL;DR
This paper introduces an efficient method for integrating LSTM language models into LVCSR decoding by combining first-pass decoding with lattice rescoring, achieving competitive results with real-time processing on GPU hardware.
Contribution
It uses the LSTM-LM directly in first-pass decoding, recombining hypotheses that share their last two words, and then rescores the resulting lattice, which makes LSTM-LM decoding in LVCSR systems efficient enough to run faster than real time.
Findings
Achieved competitive results on Hub5'00 and Librispeech datasets.
Demonstrated decoding with a runtime better than real time on GPGPU-equipped machines.
Explored full sum over state-sequences during decoding.
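To illustrate the last finding, here is a minimal sketch of the difference between keeping only the best state sequence (the usual Viterbi approximation) and summing over all state sequences of one word hypothesis. The function names and the log-probabilities are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math
from typing import List


def viterbi_score(log_probs: List[float]) -> float:
    """Viterbi approximation: keep only the best state sequence."""
    return max(log_probs)


def full_sum_score(log_probs: List[float]) -> float:
    """Full sum: log of the summed probabilities of all state sequences.

    Uses the log-sum-exp trick (subtract the maximum first) for stability.
    """
    m = max(log_probs)
    return m + math.log(sum(math.exp(lp - m) for lp in log_probs))


# Hypothetical log-probabilities of three state sequences (alignments)
# that all belong to the same word hypothesis.
alignments = [-10.0, -10.5, -12.0]

best_only = viterbi_score(alignments)
summed = full_sum_score(alignments)
```

The full-sum score is never lower than the Viterbi score, because every additional state sequence contributes some probability mass to the same word hypothesis.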
Abstract
LSTM-based language models are an important part of modern LVCSR systems, as they significantly improve performance over traditional backoff language models. Incorporating them efficiently into decoding has been notoriously difficult. In this paper, we present an approach based on a combination of one-pass decoding and lattice rescoring. We perform decoding with the LSTM-LM in the first pass but recombine hypotheses that share the last two words; afterwards, we rescore the resulting lattice. We run our systems on GPGPU-equipped machines and are able to produce competitive results on the Hub5'00 and Librispeech evaluation corpora with a runtime better than real time. In addition, we briefly investigate the possibility of carrying out the full sum over all state sequences belonging to a given word hypothesis during decoding without recombination.
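The recombination step described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' decoder: a hypothesis is modelled as a hypothetical pair of word history and log score, and `recombine` is a made-up helper that keeps the best hypothesis among those sharing the same last two words.

```python
from typing import Dict, List, Tuple

# Illustrative type: (word history, log score). Not taken from the paper.
Hypothesis = Tuple[Tuple[str, ...], float]


def recombine(hyps: List[Hypothesis], context: int = 2) -> List[Hypothesis]:
    """Keep the best-scoring hypothesis per shared last `context` words."""
    best: Dict[Tuple[str, ...], Hypothesis] = {}
    for words, score in hyps:
        key = words[-context:]  # recombination key: the last two words
        if key not in best or score > best[key][1]:
            best[key] = (words, score)
    return list(best.values())


hyps = [
    (("i", "saw", "the", "cat"), -4.2),
    (("we", "saw", "the", "cat"), -5.0),  # same last two words, worse score
    (("i", "saw", "a", "cat"), -4.8),     # different last two words, kept
]
survivors = recombine(hyps)
```

An LSTM-LM conditions on the entire word history, so merging hypotheses on the last two words is an approximation made for speed. The subsequent lattice rescoring pass described in the abstract is where the resulting lattice is scored again.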
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling
