CoVeR: Learning Covariate-Specific Vector Representations with Tensor Decompositions

Kevin Tian; Teng Zhang; James Zou

arXiv:1802.07839 · cs.CL · July 10, 2018

TL;DR

CoVeR introduces a tensor decomposition model that learns covariate-specific word embeddings, capturing additional document information such as author or venue, with improved data efficiency and interpretability over existing methods.

Contribution

The paper presents CoVeR, a novel tensor decomposition approach for jointly learning base and covariate-specific embeddings, enhancing interpretability and effectiveness in modeling covariate effects.

Findings

01. Outperforms standard covariate-specific embedding methods

02. Embeddings are topic-aligned with independent dimension meanings

03. Enables covariate effect analysis through topic comparison

Abstract

Word embedding is a useful approach for capturing co-occurrence structure in large text corpora. However, in addition to the text itself, we often have covariates associated with individual documents in the corpus (e.g., the demographics of the author, or the time and venue of publication), and we would like the embedding to capture this information naturally. We propose CoVeR, a new tensor decomposition model for vector embeddings with covariates. CoVeR jointly learns a base embedding for all words together with a weighted diagonal matrix that models how each covariate affects the base embedding. To obtain an author- or venue-specific embedding, for example, we simply multiply the base embedding by the associated transformation matrix. The main advantages of our approach are data efficiency and interpretability of the covariate transformation. Our experiments demonstrate that…
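The transformation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `covariate_embedding`, the words, the covariate labels, and all numeric values are hypothetical, chosen only to show that multiplying a base embedding by a diagonal matrix amounts to scaling each coordinate by the matching diagonal entry. How the base vectors and diagonal weights are actually learned is not covered here.

```python
# Illustrative sketch only: toy values and hypothetical names,
# not taken from the paper or its repository.

def covariate_embedding(base_vec, diag_weights):
    """Apply a diagonal transformation to a base embedding.

    Multiplying a vector by a diagonal matrix is the same as scaling
    each coordinate by the corresponding diagonal entry, so only the
    diagonal entries need to be stored.
    """
    if len(base_vec) != len(diag_weights):
        raise ValueError("base vector and diagonal must have equal length")
    return [w * x for w, x in zip(diag_weights, base_vec)]


# Hypothetical base embeddings, shared across all covariates.
base = {
    "bank": [0.5, -1.0, 2.0],
    "river": [1.0, 0.0, -0.5],
}

# Hypothetical diagonal entries, one vector per covariate (e.g. per venue).
covariate_diag = {
    "venue_a": [1.0, 0.5, 0.0],
    "venue_b": [2.0, 1.0, 1.0],
}

# The same word gets a different vector under each covariate.
bank_in_a = covariate_embedding(base["bank"], covariate_diag["venue_a"])
bank_in_b = covariate_embedding(base["bank"], covariate_diag["venue_b"])
```

In this toy setup, a diagonal entry of 0.0 removes a dimension entirely for that covariate, while a larger entry amplifies it. Because each covariate is summarized by one weight per dimension, the weights can be compared directly across covariates, which is one way to read the interpretability claim above.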

Code & Models

Repositories

justinaL/tag
tf

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Healthcare

Methods: Interpretability