Unsupervised Cross-Domain Word Representation Learning

Danushka Bollegala; Takanori Maehara; Ken-ichi Kawarabayashi

arXiv:1505.07184 · cs.CL · May 28, 2015

TL;DR

This paper introduces an unsupervised approach for learning domain-specific word representations that capture semantic variations across different domains, improving performance in domain adaptation tasks.

Contribution

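The paper proposes an unsupervised method that selects frequent words shared by the source and target domains as pivots, then optimizes an objective under which pivots must predict the non-pivots they co-occur with in each domain while pivot representations stay similar across the two domains. The resulting domain-specific embeddings are used for domain adaptation and are reported to outperform existing methods.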

Findings

01. Significantly outperforms baseline models in domain adaptation tasks.

02. Achieves the best sentiment classification accuracies across multiple domain pairs.

03. Effectively captures domain-specific word semantics.

Abstract

The meaning of a word varies from one domain to another. Despite this important domain dependence in word semantics, existing word representation learning methods are bound to a single domain. Given a pair of source-target domains, we propose an unsupervised method for learning domain-specific word representations that accurately capture the domain-specific aspects of word semantics. First, we select a subset of frequent words that occur in both domains as pivots. Next, we optimize an objective function that enforces two constraints: (a) for both source and target domain documents, pivots that appear in a document must accurately predict the co-occurring non-pivots, and (b) word representations learnt for pivots must be similar in the two domains. Moreover, we propose a method to perform domain adaptation using the learnt word representations. Our proposed method…
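The pivot selection step and constraint (b) from the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `select_pivots` and `pivot_gap`, the rule of ranking shared words by the smaller of their two domain frequencies, and the squared-distance penalty are all assumptions made for illustration; the abstract does not specify the paper's actual selection criterion or objective. Constraint (a), the prediction of non-pivots from pivots, is omitted.

```python
from collections import Counter


def select_pivots(source_docs, target_docs, k):
    """Return up to k words that occur in both domains.

    Assumption: shared words are ranked by the smaller of their two
    domain frequencies. The paper may use a different criterion.
    """
    src = Counter(w for doc in source_docs for w in doc)
    tgt = Counter(w for doc in target_docs for w in doc)
    shared = set(src) & set(tgt)
    # Highest minimum frequency first; ties broken alphabetically.
    ranked = sorted(shared, key=lambda w: (-min(src[w], tgt[w]), w))
    return ranked[:k]


def pivot_gap(src_vecs, tgt_vecs, pivots):
    """Sum of squared distances between each pivot's two domain vectors.

    Assumption: a stand-in for constraint (b), which asks pivot
    representations to be similar in the two domains.
    """
    return sum(
        sum((a - b) ** 2 for a, b in zip(src_vecs[p], tgt_vecs[p]))
        for p in pivots
    )


# Toy tokenized documents (hypothetical data).
source_docs = [["good", "book", "plot"], ["good", "read", "book"]]
target_docs = [["good", "blender", "motor"], ["good", "price", "book"]]

pivots = select_pivots(source_docs, target_docs, k=2)

# Toy 2-dimensional embeddings for the pivots in each domain.
src_vecs = {"good": [1.0, 0.0], "book": [0.0, 1.0]}
tgt_vecs = {"good": [1.0, 0.5], "book": [0.0, 1.0]}
gap = pivot_gap(src_vecs, tgt_vecs, pivots)
```

In a training loop, a penalty such as `gap` would be added to a prediction loss for constraint (a), so that pivots remain anchored across domains while non-pivot vectors are free to differ.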
