Structured Language Modeling for Speech Recognition
Ciprian Chelba, Frederick Jelinek (CLSP, The Johns Hopkins University)

TL;DR
This paper introduces a structured language model that incrementally builds hierarchical syntactic structures to improve speech recognition accuracy, outperforming traditional trigram models in perplexity and word error rate.
Contribution
The paper presents a structured language model (SLM) that builds hidden hierarchical syntactic-like structure incrementally and applies it in a two-pass (lattice decoding) speech recognizer, complementing the local context used by trigram models.
Findings
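Lower perplexity (PPL) than conventional trigram models on the WSJ corpus
Lower word error rate (WER) than conventional trigram models
Hidden hierarchical structure, built incrementally, extracts information from the word history beyond the local context a trigram model sees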
Abstract
A new language model for speech recognition is presented. The model develops hidden hierarchical syntactic-like structure incrementally and uses it to extract meaningful information from the word history, thus complementing the locality of currently used trigram models. The structured language model (SLM) and its performance in a two-pass speech recognizer (lattice decoding) are presented. Experiments on the WSJ corpus show an improvement in both perplexity (PPL) and word error rate (WER) over conventional trigram models.
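For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how perplexity and word error rate are commonly computed. It is not the authors' evaluation code: the function names `perplexity` and `word_error_rate` are hypothetical, the inputs are assumed to be per-word model probabilities and whitespace-tokenized transcripts, and any values passed in are illustrative rather than taken from the paper.

```python
import math


def perplexity(word_probs):
    """Perplexity of a word sequence, given the model probability of each word.

    Computed as exp of the negative average log-probability per word.
    """
    if not word_probs:
        raise ValueError("need at least one word probability")
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

Lower is better for both quantities: a model that assigns higher probability to the words actually spoken has lower perplexity, and a recognizer whose output needs fewer word insertions, deletions, and substitutions to match the reference transcript has lower WER.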
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling
