Diverse Embedding Neural Network Language Models

Kartik Audhkhasi; Abhinav Sethy; Bhuvana Ramabhadran

arXiv:1412.7063·cs.CL·April 17, 2015·ICLR·1 cites

TL;DR

This paper introduces DENN, a neural network architecture for language modeling that projects the input word history onto multiple diverse low-dimensional sub-spaces instead of a single higher-dimensional one, and reports a performance benefit on the Penn Treebank data set.

Contribution

The paper presents a new architecture, DENN, which aims to improve language models by using an augmented loss function to encourage diversity among the sub-space projections during training.

Findings

01 Improved perplexity on Penn Treebank dataset

02 Diverse sub-space projections enhance model performance

03 Augmented loss function effectively promotes diversity

Abstract

We propose Diverse Embedding Neural Network (DENN), a novel architecture for language models (LMs). A DENNLM projects the input word history vector onto multiple diverse low-dimensional sub-spaces instead of a single higher-dimensional sub-space as in conventional feed-forward neural network LMs. We encourage these sub-spaces to be diverse during network training through an augmented loss function. Our language modeling experiments on the Penn Treebank data set show the performance benefit of using a DENNLM.
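
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of the diversity term, so the mean pairwise cosine similarity used here is only an assumption, not the authors' loss. The names `project_history`, `diversity_penalty`, `augmented_loss`, the `weight` parameter, and all dimensions are hypothetical.

```python
import numpy as np


def project_history(history, projections):
    """Project one history vector onto each low-dimensional sub-space."""
    return [history @ P for P in projections]


def diversity_penalty(embeddings):
    """Mean pairwise cosine similarity between sub-space embeddings.

    Assumed stand-in for the paper's diversity term: lower means more diverse.
    """
    sims = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            a, b = embeddings[i], embeddings[j]
            denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-12
            sims.append(float(a @ b) / denom)
    # A single sub-space has no pairs, so there is nothing to penalize.
    return float(np.mean(sims)) if sims else 0.0


def augmented_loss(cross_entropy, embeddings, weight=0.1):
    """Prediction loss plus a weighted diversity penalty (hypothetical weight)."""
    return cross_entropy + weight * diversity_penalty(embeddings)


# Toy usage: a 12-dimensional history vector and three 4-dimensional sub-spaces.
rng = np.random.default_rng(0)
history = rng.normal(size=12)
projections = [rng.normal(size=(12, 4)) for _ in range(3)]
embeddings = project_history(history, projections)
loss = augmented_loss(cross_entropy=2.0, embeddings=embeddings)
```

In this sketch, minimizing the augmented loss pushes the sub-space embeddings apart while still fitting the prediction target. In a real model the projections would be trained jointly with the rest of the network rather than sampled at random.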

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Neural Networks and Applications