CoVeR: Learning Covariate-Specific Vector Representations with Tensor Decompositions
Kevin Tian, Teng Zhang, James Zou

TL;DR
CoVeR introduces a tensor decomposition model that learns covariate-specific word embeddings, capturing additional document information such as author or venue, with improved data efficiency and interpretability over existing methods.
Contribution
The paper presents CoVeR, a novel tensor decomposition approach for jointly learning base and covariate-specific embeddings, enhancing interpretability and effectiveness in modeling covariate effects.
Findings
Outperforms standard covariate-specific embedding methods
Embedding dimensions align with topics, each dimension carrying an independent meaning
Enables covariate effect analysis through topic comparison
Abstract
Word embedding is a useful approach to capture co-occurrence structures in large text corpora. However, in addition to the text data itself, we often have additional covariates associated with individual corpus documents (e.g. the demographics of the author, or the time and venue of publication), and we would like the embedding to naturally capture this information. We propose CoVeR, a new tensor decomposition model for vector embeddings with covariates. CoVeR jointly learns a base embedding for all the words as well as a weighted diagonal matrix to model how each covariate affects the base embedding. To obtain an author- or venue-specific embedding, for example, we can then simply multiply the base embedding by the associated transformation matrix. The main advantages of our approach are data efficiency and interpretability of the covariate transformation. Our experiments demonstrate that…
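The multiplication step described in the abstract can be illustrated with a minimal NumPy sketch. It only shows how a covariate-specific embedding would be read off from a base embedding and a diagonal transformation; it does not implement the training objective. The array sizes, the covariate names, the random values, and the helper `covariate_embedding` are all assumptions made for illustration and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 5 words in the vocabulary, 4 embedding dimensions.
n_words, dim = 5, 4

# Hypothetical covariate names (e.g. one author and one venue).
covariates = ["author_A", "venue_B"]

# Base embedding shared across all covariates: one row per word.
# Random values stand in for learned parameters.
base = rng.normal(size=(n_words, dim))

# One weight vector per covariate. Each vector is the diagonal of that
# covariate's transformation matrix, so it rescales embedding dimensions.
weights = {name: rng.uniform(0.5, 1.5, size=dim) for name in covariates}


def covariate_embedding(base, diag_weights):
    """Return base @ diag(diag_weights), computed via broadcasting."""
    return base * diag_weights


# Covariate-specific embedding for the hypothetical author.
specific = covariate_embedding(base, weights["author_A"])
```

Because the transformation is diagonal, each covariate adds only one weight per embedding dimension rather than a full set of word vectors, and comparing the weight vectors of two covariates shows which dimensions each one emphasizes.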
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Healthcare
Methods: Interpretability
