Factorial Hidden Markov Models for Learning Representations of Natural Language
Anjan Nepal, Alexander Yates

TL;DR
This paper introduces an efficient variational learning algorithm for Factorial Hidden Markov Models whose per-word features capture global context; on POS tagging and chunking, these features are competitive with or better than existing representation learning methods.
Contribution
It develops an efficient variational method for learning factorial HMMs that produce context-sensitive features for language processing.
Findings
Features are competitive with or better than existing state-of-the-art representation learning methods on POS tagging and chunking.
Features for each word are sensitive to the entire input sequence, not just a local context window.
Method is scalable to large text datasets.
Abstract
Most representation learning algorithms for language and image processing are local, in that they identify features for a data point based on surrounding points. Yet in language processing, the correct meaning of a word often depends on its global context. As a step toward incorporating global context into representation learning, we develop a representation learning algorithm that incorporates joint prediction into its technique for producing features for a word. We develop efficient variational methods for learning Factorial Hidden Markov Models from large texts, and use variational distributions to produce features for each word that are sensitive to the entire input sequence, not just to a local context window. Experiments on part-of-speech tagging and chunking indicate that the features are competitive with or better than existing state-of-the-art representation learning methods.
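To make the idea of sequence-sensitive features concrete, here is a minimal sketch, not the authors' algorithm. It assumes the variational approximation decouples the factorial model into independent chains, each with its own per-position potentials, and runs forward-backward on every chain to get posterior marginals that serve as features. The names `forward_backward`, `layer_features`, and the potential arrays are hypothetical, and the step that fits the variational potentials is omitted.

```python
import numpy as np


def forward_backward(init, trans, potentials):
    """Posterior state marginals for one chain.

    init:       (K,) initial state distribution
    trans:      (K, K) transition matrix, rows sum to 1
    potentials: (T, K) non-negative per-position scores (assumed to come
                from a variational update that is not shown here)
    """
    T, K = potentials.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))

    # Forward pass, rescaled at each step for numerical stability.
    a = init * potentials[0]
    alpha[0] = a / a.sum()
    for t in range(1, T):
        a = (alpha[t - 1] @ trans) * potentials[t]
        alpha[t] = a / a.sum()

    # Backward pass, also rescaled.
    for t in range(T - 2, -1, -1):
        b = trans @ (potentials[t + 1] * beta[t + 1])
        beta[t] = b / b.sum()

    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)


def layer_features(inits, transitions, potentials_per_layer):
    """Concatenate per-layer posteriors into one feature vector per word."""
    posteriors = [
        forward_backward(pi, A, h)
        for pi, A, h in zip(inits, transitions, potentials_per_layer)
    ]
    return np.concatenate(posteriors, axis=1)  # shape (T, sum of K over layers)


if __name__ == "__main__":
    init = np.array([0.5, 0.5])
    trans = np.array([[0.9, 0.1], [0.1, 0.9]])  # "sticky" toy transitions
    flat = np.ones((4, 2))                      # uninformative potentials
    skewed = flat.copy()
    skewed[-1] = [0.9, 0.1]                     # evidence only at the last word
    # The first word's posterior shifts toward state 0 in the second case,
    # even though only the last position's potential changed.
    print(forward_backward(init, trans, flat)[0])
    print(forward_backward(init, trans, skewed)[0])
```

In this toy setup, evidence placed only on the final word changes the feature vector of the first word. That is the sense in which posterior-based features depend on the whole sequence rather than on a fixed context window.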
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
