Lookup-Table Recurrent Language Models for Long Tail Speech Recognition
W. Ronny Huang, Tara N. Sainath, Cal Peyser, Shankar Kumar, David Rybach, Trevor Strohman

TL;DR
This paper presents LookupLM, a method for scaling up RNN language models that improves long tail speech recognition by enlarging the embedding table with only a constant increase in floating point operations, improving both long tail perplexity and WER.
Contribution
Introduction of LookupLM, a method to scale RNN language models via large embedding tables for better long tail performance without extra computation.
Findings
Improved long tail perplexity by 2.44 points.
Reduced long tail WER by 23.4%.
Achieved gains comparable to scaling up the baseline to 6.2x as many FLOPs.
Abstract
We introduce Lookup-Table Language Models (LookupLM), a method for scaling up the size of RNN language models with only a constant increase in the floating point operations, by increasing the expressivity of the embedding table. In particular, we instantiate an (additional) embedding table which embeds the previous n-gram token sequence, rather than a single token. This allows the embedding table to be scaled up arbitrarily -- with a commensurate increase in performance -- without changing the token vocabulary. Since embeddings are sparsely retrieved from the table via a lookup, increasing the size of the table adds neither extra operations to each forward pass nor extra parameters that need to be stored on limited GPU/TPU memory. We explore scaling n-gram embedding tables up to nearly a billion parameters. When trained on a 3-billion sentence corpus, we find that LookupLM improves long…
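The n-gram lookup described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`make_table`, `ngram_row`, `lookup_input`), the mixed-radix n-gram ID followed by a modulo, and the choice to concatenate the two embeddings are all assumptions made for illustration. The point it shows is that each step retrieves exactly one row from the n-gram table, so the work per step does not depend on how many rows the table has.

```python
import random


def make_table(num_rows, dim, seed=0):
    """Build a randomly initialised embedding table (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


def ngram_row(tokens, vocab_size, num_rows):
    """Map an n-gram of token IDs to a table row.

    Assumption: the n-gram is encoded as a mixed-radix integer and folded
    into the table with a modulo, so collisions are possible.
    """
    ngram_id = 0
    for t in tokens:
        ngram_id = ngram_id * vocab_size + t
    return ngram_id % num_rows


def lookup_input(history, token_table, ngram_table, n, vocab_size):
    """Return the RNN input for the next step: token and n-gram embeddings joined."""
    token_vec = token_table[history[-1]]      # ordinary single-token lookup
    context = history[-n:]                    # previous n tokens
    row = ngram_row(context, vocab_size, len(ngram_table))
    ngram_vec = ngram_table[row]              # one row read, whatever the table size
    return token_vec + ngram_vec              # list concatenation


VOCAB, DIM = 50, 4
token_table = make_table(VOCAB, DIM, seed=1)
small_table = make_table(1_000, DIM, seed=2)
large_table = make_table(20_000, DIM, seed=3)

history = [7, 12, 3, 41]
x_small = lookup_input(history, token_table, small_table, n=3, vocab_size=VOCAB)
x_large = lookup_input(history, token_table, large_table, n=3, vocab_size=VOCAB)
print(len(x_small), len(x_large))
```

Both inputs have the same width even though the second table has twenty times as many rows, which is the sense in which a larger table adds parameters without adding per-step operations. In a real system the table would be trained and could live in host memory rather than on the accelerator; the sketch leaves that out.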
