Factorial Hidden Markov Models for Learning Representations of Natural Language

Anjan Nepal; Alexander Yates

arXiv:1312.6168·cs.LG·February 19, 2014·ICLR·2 cites

TL;DR

This paper introduces variational learning methods for Factorial Hidden Markov Models that produce word features sensitive to the entire input sequence. On part-of-speech tagging and chunking, these features are competitive with or better than existing representation learning methods.

Contribution

The paper develops efficient variational methods for learning Factorial Hidden Markov Models from large texts, and uses the resulting variational distributions as context-sensitive features for each word.

Findings

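01. Features are competitive with or better than existing state-of-the-art representation learning methods on part-of-speech tagging and chunking.

02. Features for each word are sensitive to the entire input sequence, not just to a local context window.

03. The variational methods are efficient enough to learn from large texts.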

Abstract

Most representation learning algorithms for language and image processing are local, in that they identify features for a data point based on surrounding points. Yet in language processing, the correct meaning of a word often depends on its global context. As a step toward incorporating global context into representation learning, we develop a representation learning algorithm that incorporates joint prediction into its technique for producing features for a word. We develop efficient variational methods for learning Factorial Hidden Markov Models from large texts, and use variational distributions to produce features for each word that are sensitive to the entire input sequence, not just to a local context window. Experiments on part-of-speech tagging and chunking indicate that the features are competitive with or better than existing state-of-the-art representation learning methods.
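
As a rough illustration of how per-word features can depend on the whole sequence, the sketch below runs forward-backward separately on each layer of a factorial model and concatenates the resulting posterior marginals. It is a minimal sketch under stated assumptions, not the authors' algorithm: the abstract does not give the exact variational updates, so the per-layer `evidence` terms are taken as already computed, and the names `chain_posteriors` and `word_features` are hypothetical.

```python
import numpy as np


def chain_posteriors(init, trans, evidence):
    """Posterior state marginals for one Markov chain (scaled forward-backward).

    init:     (K,) initial state distribution
    trans:    (K, K) with trans[i, j] = P(state_t = j | state_{t-1} = i)
    evidence: (T, K) non-negative per-position scores for each state
              (assumed given; stands in for a layer's variational evidence)
    Returns a (T, K) array whose rows each sum to 1.
    """
    T, K = evidence.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))

    # Forward pass, normalizing at each step for numerical stability.
    alpha[0] = init * evidence[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * evidence[t]
        alpha[t] /= alpha[t].sum()

    # Backward pass: information from later positions flows to earlier ones.
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (evidence[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)


def word_features(inits, transs, evidences):
    """Concatenate the per-layer posteriors into one feature vector per word."""
    layers = [chain_posteriors(i, a, e) for i, a, e in zip(inits, transs, evidences)]
    return np.concatenate(layers, axis=1)


# Toy setup (all values are illustrative assumptions): M layers, K states, T words.
K, T, M = 4, 5, 2
init = np.full(K, 1.0 / K)
trans = 0.7 * np.eye(K) + 0.3 / K  # "sticky" transitions; each row sums to 1
evidence = np.ones((T, K))         # uninformative evidence at every position

features = word_features([init] * M, [trans] * M, [evidence] * M)
print(features.shape)  # (5, 8)
```

Because the backward pass carries information from later positions, changing the evidence at the last word also shifts the feature vector of the first word, which is the sense in which such features are sensitive to the entire input sequence rather than a local window.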

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications