Progress and Tradeoffs in Neural Language Models
Raphael Tang, Jimmy Lin

TL;DR
This paper analyzes the tradeoffs between neural language models and classic Kneser-Ney language models, showing that neural models lower perplexity but substantially increase inference latency and energy consumption, a cost that matters most on mobile devices.
Contribution
To the authors' knowledge, it is the first work to systematically compare neural and classic language models on energy usage, latency, perplexity, and prediction accuracy across different hardware platforms.
Findings
Neural models reduce perplexity significantly.
Energy and latency increase substantially with neural models.
The quality-performance tradeoff is more pronounced on mobile devices.
Abstract
In recent years, we have witnessed a dramatic shift towards techniques driven by neural networks for a variety of NLP tasks. Undoubtedly, neural language models (NLMs) have reduced perplexity by impressive amounts. This progress, however, comes at a substantial cost in performance, in terms of inference latency and energy consumption, which is particularly of concern in deployments on mobile devices. This paper, which examines the quality-performance tradeoff of various language modeling techniques, represents to our knowledge the first to make this observation. We compare state-of-the-art NLMs with "classic" Kneser-Ney (KN) LMs in terms of energy usage, latency, perplexity, and prediction accuracy using two standard benchmarks. On a Raspberry Pi, we find that orders of increase in latency and energy usage correspond to less change in perplexity, while the difference is much less…
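As a rough illustration of the perplexity metric the abstract refers to (a minimal sketch, not code from the paper), perplexity can be computed as the exponential of the average negative log-probability a model assigns to each token. The function name and the probability lists below are hypothetical and chosen only for illustration.

```python
import math


def perplexity(token_probs):
    """Return exp of the mean negative log-probability per token.

    token_probs: probabilities a language model assigned to each
    observed token of a text (hypothetical inputs, each in (0, 1]).
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Sum of negative log-probabilities over all tokens.
    neg_log_likelihood = -sum(math.log(p) for p in token_probs)
    return math.exp(neg_log_likelihood / len(token_probs))


# Hypothetical per-token probabilities from two models on the same text.
weaker_model = [0.25, 0.25, 0.25, 0.25]
stronger_model = [0.5, 0.5, 0.5, 0.5]

print(perplexity(weaker_model))    # higher perplexity (worse fit)
print(perplexity(stronger_model))  # lower perplexity (better fit)
```

Lower perplexity means the model is less "surprised" by the text. The tradeoff the paper examines is how much extra latency and energy a given reduction in this number costs.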
Taxonomy
Topics: Topic Modeling · Ferroelectric and Negative Capacitance Devices · Speech Recognition and Synthesis
