Domain adaptation for sequence labeling using hidden Markov models

Edouard Grave (LIENS; INRIA Paris - Rocquencourt); Guillaume Obozinski (LIGM); Francis Bach (LIENS; INRIA Paris - Rocquencourt)

arXiv:1312.4092 · cs.CL · December 17, 2013 · 1 citation

TL;DR

This paper proposes using hidden Markov models (HMMs) to learn word representations that are robust to domain shift, and using them as features for sequence labeling tasks such as part-of-speech tagging.

Contribution

The paper uses HMMs for domain adaptation in sequence labeling. It studies how learning the representation from source-domain data, target-domain data, or both affects the result, and compares different ways of representing words with an HMM.

Findings

01 HMM-based representations improve domain robustness.

02 Using combined domain data enhances performance.

03 The method reduces the performance drop across domains.

Abstract

Most natural language processing systems based on machine learning are not robust to domain shift. For example, a state-of-the-art syntactic dependency parser trained on Wall Street Journal sentences has an absolute drop in performance of more than ten points when tested on textual data from the Web. An efficient solution to make these methods more robust to domain shift is to first learn a word representation using large amounts of unlabeled data from both domains, and then use this representation as features in a supervised learning algorithm. In this paper, we propose to use hidden Markov models to learn word representations for part-of-speech tagging. In particular, we study the influence of using data from the source, the target or both domains to learn the representation and the different ways to represent words using an HMM.
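
The pipeline described in the abstract (learn an HMM on unlabeled text, then use it to build word features for a supervised tagger) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy vocabulary, the two-state parameters `pi`, `A`, `B`, and the helper names `forward_backward` and `token_features` are all hypothetical, and the parameters are set by hand rather than learned from data. The sketch shows one possible way to represent a word with an HMM: the posterior distribution over latent states at each position, plus the most probable state under that posterior.

```python
# Illustrative sketch only: hand-set toy parameters, not learned from data.

def forward_backward(obs, pi, A, B):
    """Return p(z_t = k | x_1..x_T) for every position t (normalized passes)."""
    T, K = len(obs), len(pi)
    # Forward pass, rescaled at each step to avoid numerical underflow.
    alpha = []
    for t in range(T):
        if t == 0:
            a = [pi[k] * B[k][obs[0]] for k in range(K)]
        else:
            prev = alpha[-1]
            a = [B[k][obs[t]] * sum(prev[j] * A[j][k] for j in range(K))
                 for k in range(K)]
        s = sum(a)
        alpha.append([v / s for v in a])
    # Backward pass, also rescaled at each step.
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        b = [sum(A[k][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(K))
             for k in range(K)]
        s = sum(b)
        beta[t] = [v / s for v in b]
    # Posterior marginals are proportional to alpha * beta at each position.
    post = []
    for t in range(T):
        g = [alpha[t][k] * beta[t][k] for k in range(K)]
        s = sum(g)
        post.append([v / s for v in g])
    return post


def token_features(words, vocab, pi, A, B):
    """Build one feature dict per token: word identity plus HMM-derived features."""
    post = forward_backward([vocab[w] for w in words], pi, A, B)
    feats = []
    for w, p in zip(words, post):
        f = {"word=" + w: 1.0}
        for k, v in enumerate(p):
            f["hmm_state_%d" % k] = v          # soft (posterior) representation
        best = max(range(len(p)), key=p.__getitem__)
        f["hmm_argmax=%d" % best] = 1.0        # hard (most probable state) representation
        feats.append(f)
    return feats


# Hypothetical toy model: 2 latent states, 3 word types.
vocab = {"the": 0, "market": 1, "lol": 2}
pi = [0.6, 0.4]                       # initial state distribution
A = [[0.3, 0.7], [0.6, 0.4]]          # A[j][k] = p(z_t = k | z_{t-1} = j)
B = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]  # B[k][w] = p(x_t = w | z_t = k)

feats = token_features(["the", "market", "lol"], vocab, pi, A, B)
```

Each dictionary in `feats` could then be passed to a supervised sequence labeler trained on labeled source-domain sentences. Because the latent-state features are shared across word types, a word seen only in the target domain still receives informative features, provided the HMM itself was estimated on unlabeled text that covers that domain.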

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Algorithms and Data Compression