Generalizing and Hybridizing Count-based and Neural Language Models

Graham Neubig; Chris Dyer

arXiv:1606.00499·cs.CL·September 27, 2016

TL;DR

This paper introduces a unified framework that combines count-based n-gram models and neural language models into hybrid models, leveraging the strengths of both for improved language modeling performance.

Contribution

It proposes a novel unified modeling framework that dynamically mixes count-based and neural probability distributions, enabling the creation of hybrid language models with combined benefits.

Findings

1. Hybrid models outperform pure count-based and pure neural models in the paper's experiments.

2. The framework allows flexible integration of different modeling paradigms.

3. Hybrid models combine the scalability and test-time speed of count-based models with the modeling accuracy of neural models.

Abstract

Language models (LMs) are statistical models that calculate probabilities over sequences of words or other discrete symbols. Currently two major paradigms for language modeling exist: count-based n-gram models, which have advantages of scalability and test-time speed, and neural LMs, which often achieve superior modeling performance. We demonstrate how both varieties of models can be unified in a single modeling framework that defines a set of probability distributions over the vocabulary of words, and then dynamically calculates mixture weights over these distributions. This formulation allows us to create novel hybrid models that combine the desirable features of count-based and neural LMs, and experiments demonstrate the advantages of these approaches.
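
The mixture formulation described in the abstract can be illustrated with a small sketch: given several probability distributions over the vocabulary for a context, the model output is their weighted sum, with weights computed per context. The code below is only an illustration under stated assumptions, not the authors' implementation: the toy vocabulary, the three component distributions, the raw scores, and the helper names `softmax` and `mix` are all hypothetical, and a softmax over scores is just one plausible way to obtain weights that sum to one.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix(distributions, weights):
    """Return the weighted sum of the component distributions.

    distributions: list of K lists, each a distribution over the vocabulary.
    weights: list of K non-negative mixture weights that sum to 1.
    """
    vocab_size = len(distributions[0])
    return [
        sum(w * dist[j] for w, dist in zip(weights, distributions))
        for j in range(vocab_size)
    ]


# Hypothetical toy setup: a three-word vocabulary and three component
# distributions for a single context (values invented for illustration).
vocab = ["the", "cat", "sat"]
bigram = [0.1, 0.7, 0.2]      # stands in for a count-based bigram estimate
unigram = [0.5, 0.25, 0.25]   # stands in for a count-based unigram estimate
uniform = [1 / 3, 1 / 3, 1 / 3]

# Hypothetical context-dependent scores; in a hybrid model these could
# come from a neural network that reads the context.
scores = [2.0, 1.0, 0.0]
weights = softmax(scores)

p = mix([bigram, unigram, uniform], weights)
for word, prob in zip(vocab, p):
    print(f"{word}: {prob:.4f}")
```

Because each component is a valid distribution and the weights are non-negative and sum to one, the mixture is itself a valid distribution. Putting all the weight on a single component recovers that component unchanged, which is the sense in which a pure count-based model is a special case of the mixture.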

Code & Models

Repositories

neubig/modlm (Official)
