TL;DR
This paper introduces a unified framework that combines count-based n-gram models and neural language models into hybrid models, leveraging the strengths of both for improved language modeling performance.
Contribution
The paper proposes a unified modeling framework that defines a set of probability distributions over the vocabulary and dynamically calculates mixture weights over them, so that count-based and neural components can be combined into hybrid language models.
Findings
Hybrid models outperform pure count-based and neural models in experiments.
The framework allows flexible integration of different modeling paradigms.
Hybrid models aim to pair the scalability and test-time speed of count-based n-gram models with the modeling performance of neural LMs.
Abstract
Language models (LMs) are statistical models that calculate probabilities over sequences of words or other discrete symbols. Currently two major paradigms for language modeling exist: count-based n-gram models, which have advantages of scalability and test-time speed, and neural LMs, which often achieve superior modeling performance. We demonstrate how both varieties of models can be unified in a single modeling framework that defines a set of probability distributions over the vocabulary of words, and then dynamically calculates mixture weights over these distributions. This formulation allows us to create novel hybrid models that combine the desirable features of count-based and neural LMs, and experiments demonstrate the advantages of these approaches.
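The mixture formulation described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy corpus, the two count-based distributions (unigram and bigram), and the hand-written `mixture_weights` heuristic are all assumptions chosen for illustration. In a real hybrid model the weights could instead come from a learned function such as a neural network.

```python
import math
from collections import Counter

# Toy corpus (hypothetical, for illustration only).
corpus = "the cat sat on the mat the cat ate".split()
vocab = sorted(set(corpus))
unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))


def softmax(scores):
    """Turn arbitrary scores into non-negative weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_count(prev):
    """Number of times `prev` was observed as a bigram context."""
    return sum(c for (p, _), c in bigram_counts.items() if p == prev)


def unigram_dist():
    """Maximum-likelihood unigram distribution over the vocabulary."""
    total = sum(unigram_counts.values())
    return {w: unigram_counts[w] / total for w in vocab}


def bigram_dist(prev):
    """Maximum-likelihood bigram distribution; falls back to unigram for unseen contexts."""
    total = context_count(prev)
    if total == 0:
        return unigram_dist()
    return {w: bigram_counts[(prev, w)] / total for w in vocab}


def mixture_weights(prev):
    """Assumed heuristic: trust the bigram distribution more for frequent contexts."""
    return softmax([0.0, math.log1p(context_count(prev))])


def hybrid_prob(prev):
    """p(w | prev) = sum_k lambda_k(prev) * p_k(w | prev)."""
    dists = [unigram_dist(), bigram_dist(prev)]
    weights = mixture_weights(prev)
    return {w: sum(lam * d[w] for lam, d in zip(weights, dists)) for w in vocab}


probs = hybrid_prob("the")
```

Because each component is a valid distribution and the weights sum to 1, the mixture is itself a valid distribution over the vocabulary. Swapping the heuristic in `mixture_weights` for a context-dependent learned function, or adding a neural distribution to `dists`, leaves the rest of the sketch unchanged.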
