Generative and Discriminative Text Classification with Recurrent Neural Networks

Dani Yogatama; Chris Dyer; Wang Ling; Phil Blunsom

arXiv:1703.01898·stat.ML·May 29, 2017·109 cites

TL;DR

This paper compares generative and discriminative RNN models for text classification. Generative models reach their asymptotic error rate more rapidly and are more robust to shifts in the data distribution, although their asymptotic error rates are higher.

Contribution

It provides an empirical analysis of RNN-based generative versus discriminative models, highlighting their learning dynamics and robustness in shifting data environments.

Findings

1. Generative RNN models have higher asymptotic error rates than discriminatively trained RNN models.

2. Generative models approach their asymptotic error rate more rapidly than their discriminative counterparts.

3. Generative models are more robust to shifts in the data distribution, as shown in zero-shot and continual learning experiments.

Abstract

We empirically characterize the performance of discriminative and generative LSTM models for text classification. We find that although RNN-based generative models are more powerful than their bag-of-words ancestors (e.g., they account for conditional dependencies across words in a document), they have higher asymptotic error rates than discriminatively trained RNN models. However, we also find that generative models approach their asymptotic error rate more rapidly than their discriminative counterparts, the same pattern that Ng & Jordan (2001) proved holds for linear classification models that make more naive conditional independence assumptions. Building on this finding, we hypothesize that RNN-based generative classification models will be more robust to shifts in the data distribution. This hypothesis is confirmed in a series of experiments in zero-shot and continual learning…
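To make the generative side of the comparison concrete, here is a minimal sketch of a generative text classifier: it models p(x | y) and p(y) and predicts the label that maximizes log p(x | y) + log p(y). This is an illustration under stated assumptions, not the paper's model. The class-conditional model below is a smoothed unigram model (the bag-of-words "ancestor" mentioned in the abstract), whereas the paper uses LSTMs. The function names, the smoothing constant `alpha`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def train_generative(docs, labels, alpha=1.0):
    """Fit class priors p(y) and add-alpha smoothed unigram models p(word | y).

    Assumption: a unigram model stands in for the class-conditional
    sequence model; the paper itself uses an LSTM here.
    """
    vocab = {w for d in docs for w in d.split()}
    class_counts = Counter(labels)
    word_counts = {y: Counter() for y in class_counts}
    for d, y in zip(docs, labels):
        word_counts[y].update(d.split())

    log_prior = {y: math.log(n / len(labels)) for y, n in class_counts.items()}

    def log_likelihood(doc, y):
        # One extra vocabulary slot is reserved for unseen words.
        total = sum(word_counts[y].values()) + alpha * (len(vocab) + 1)
        return sum(
            math.log((word_counts[y][w] + alpha) / total) for w in doc.split()
        )

    return log_prior, log_likelihood


def classify(doc, log_prior, log_likelihood):
    """Bayes-rule decision: argmax over y of log p(y) + log p(doc | y)."""
    return max(log_prior, key=lambda y: log_prior[y] + log_likelihood(doc, y))


# Toy data, invented purely for illustration.
docs = ["good great film", "great acting", "bad boring film", "boring plot"]
labels = ["pos", "pos", "neg", "neg"]
log_prior, log_likelihood = train_generative(docs, labels)
prediction = classify("great film", log_prior, log_likelihood)
```

A discriminative classifier would instead parameterize p(y | x) directly and never model the text itself. Swapping the unigram `log_likelihood` for a sequence model that conditions each word on the preceding words and the label gives the kind of RNN-based generative classifier the abstract describes.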

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Healthcare