Domain Adapted Word Embeddings for Improved Sentiment Classification

Prathusha K Sarma; Yingyu Liang; William A Sethares

arXiv:1805.04576 · cs.CL · May 15, 2018

TL;DR

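This paper introduces Domain Adapted (DA) word embeddings, which combine generic and Domain Specific (DS) embeddings by aligning them with Canonical Correlation Analysis (CCA) or the related nonlinear Kernel CCA. On sentiment classification tasks, DA embeddings substantially outperform both generic and DS embeddings when used as input features to sentence encoding algorithms.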

Contribution

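The paper proposes a method that aligns the generic and domain-specific vectors of corresponding words via CCA (or Kernel CCA), so that the resulting embeddings combine the breadth of generic embeddings with the specificity of domain-specific ones for sentiment classification.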

Findings

1. DA embeddings outperform generic and DS embeddings in sentiment classification.
2. Alignment via CCA improves embedding quality for domain-specific tasks.
3. Significant accuracy gains are observed across multiple sentence encoding algorithms.

Abstract

Generic word embeddings are trained on large-scale generic corpora; Domain Specific (DS) word embeddings are trained only on data from a domain of interest. This paper proposes a method to combine the breadth of generic embeddings with the specificity of domain specific embeddings. The resulting embeddings, called Domain Adapted (DA) word embeddings, are formed by aligning corresponding word vectors using Canonical Correlation Analysis (CCA) or the related nonlinear Kernel CCA. Evaluation results on sentiment classification tasks show that the DA embeddings substantially outperform both generic and DS embeddings when used as input features to standard or state-of-the-art sentence encoding algorithms for classification.
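
The alignment step described in the abstract can be illustrated with a minimal linear CCA sketch in NumPy. This is not the authors' code: the function names (`cca_projections`, `domain_adapted_embeddings`), the small ridge term `reg`, and the choice to form each DA vector as the equal-weight average of the two projected vectors are all assumptions made for illustration. It also assumes the rows of the two input matrices correspond to the same words, in the same order, from the vocabulary shared by both embedding sets.

```python
import numpy as np


def _inv_sqrt(cov):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(cov)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def cca_projections(X, Y, k, reg=1e-8):
    """Linear CCA between X (n x dx) and Y (n x dy).

    Returns projection matrices A (dx x k) and B (dy x k) and the top-k
    canonical correlations. `reg` is a small ridge term (an assumption of
    this sketch) that keeps the covariance matrices invertible.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    cxy = X.T @ Y / (n - 1)
    wx, wy = _inv_sqrt(cxx), _inv_sqrt(cyy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations, sorted in descending order.
    U, s, Vt = np.linalg.svd(wx @ cxy @ wy)
    return wx @ U[:, :k], wy @ Vt.T[:, :k], s[:k]


def domain_adapted_embeddings(W_generic, W_ds, k):
    """Project both embedding sets into a shared k-dim space and average.

    The equal-weight average is an illustrative choice, not a claim about
    the paper's exact combination rule.
    """
    A, B, corr = cca_projections(W_generic, W_ds, k)
    g_proj = (W_generic - W_generic.mean(axis=0)) @ A
    ds_proj = (W_ds - W_ds.mean(axis=0)) @ B
    return 0.5 * g_proj + 0.5 * ds_proj, corr


# Toy usage with synthetic vectors for 200 shared words (hypothetical data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
W_generic = latent @ rng.normal(size=(3, 5)) + 0.01 * rng.normal(size=(200, 5))
W_ds = latent @ rng.normal(size=(3, 4)) + 0.01 * rng.normal(size=(200, 4))
W_da, canonical_corr = domain_adapted_embeddings(W_generic, W_ds, k=2)
```

Each row of `W_da` could then replace the generic or DS vector of the corresponding word as the input feature to a sentence encoder. Kernel CCA, the nonlinear variant mentioned in the abstract, is not covered by this sketch.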

Code & Models

Repositories

GallupGovt/multivac (PyTorch)

Taxonomy

Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Natural Language Processing Techniques