A Probabilistic Framework for Learning Domain Specific Hierarchical Word Embeddings
Lahari Poddar, Gyorgy Szarvas, Lea Frermann

TL;DR
This paper introduces a probabilistic model for learning hierarchical, domain-specific word embeddings that adapt to different contexts within a taxonomy, improving representation accuracy for domain-specific language.
Contribution
The authors propose a structured probabilistic framework that learns multiple related word embeddings across hierarchical domains, capturing domain-specific meanings more effectively than existing models.
Findings
The model outperforms state-of-the-art approaches on large real-world review datasets.
The learned embeddings are more intuitive and domain-aware.
The model scales well with the number of domains.
Abstract
The meaning of a word often varies depending on its usage in different domains. The standard word embedding models struggle to represent this variation, as they learn a single global representation for a word. We propose a method to learn domain-specific word embeddings, from text organized into hierarchical domains, such as reviews in an e-commerce website, where products follow a taxonomy. Our structured probabilistic model allows vector representations for the same word to drift away from each other for distant domains in the taxonomy, to accommodate its domain-specific meanings. By learning sets of domain-specific word representations jointly, our model can leverage domain relationships, and it scales well with the number of domains. Using large real-world review datasets, we demonstrate the effectiveness of our model compared to state-of-the-art approaches, in learning…
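The "drift" behaviour described in the abstract can be illustrated with a small sketch. The abstract does not specify the model's distributions, so everything here is an assumption made for illustration: a Gaussian random-walk prior in which each domain's vector for a word is its parent domain's vector plus independent noise, a hypothetical toy taxonomy, and hypothetical helper names (`tree_distance`, `sample_word_vectors`, `mean_squared_distance`). It is not the authors' implementation.

```python
import random

# Hypothetical toy taxonomy (child -> parent); the root has no parent.
TAXONOMY = {
    "root": None,
    "electronics": "root",
    "phones": "electronics",
    "laptops": "electronics",
    "kitchen": "root",
    "cookware": "kitchen",
}


def path_to_root(node, taxonomy):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while taxonomy[path[-1]] is not None:
        path.append(taxonomy[path[-1]])
    return path


def tree_distance(a, b, taxonomy):
    """Number of edges on the path between domains `a` and `b`."""
    depth_in_a = {n: i for i, n in enumerate(path_to_root(a, taxonomy))}
    for j, n in enumerate(path_to_root(b, taxonomy)):
        if n in depth_in_a:  # first shared ancestor
            return depth_in_a[n] + j
    raise ValueError("domains do not share a root")


def sample_word_vectors(taxonomy, dim, sigma, rng):
    """Sample one word's vector per domain under the ASSUMED prior:
    vector(child) = vector(parent) + Gaussian noise with std `sigma`."""
    vectors = {}

    def visit(node):
        if node not in vectors:
            parent = taxonomy[node]
            mean = [0.0] * dim if parent is None else visit(parent)
            vectors[node] = [m + rng.gauss(0.0, sigma) for m in mean]
        return vectors[node]

    for node in taxonomy:
        visit(node)
    return vectors


def squared_distance(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))


def mean_squared_distance(a, b, taxonomy, dim=8, sigma=1.0, draws=500, seed=0):
    """Average squared distance between the vectors of domains `a` and `b`
    over repeated draws from the assumed prior."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        vecs = sample_word_vectors(taxonomy, dim, sigma, rng)
        total += squared_distance(vecs[a], vecs[b])
    return total / draws


if __name__ == "__main__":
    for a, b in [("phones", "laptops"), ("phones", "cookware")]:
        print(a, b, tree_distance(a, b, TAXONOMY),
              round(mean_squared_distance(a, b, TAXONOMY), 2))
```

Under this assumed prior, the expected squared distance between two domains' vectors for the same word is proportional to the number of taxonomy edges separating them, so sibling domains such as "phones" and "laptops" stay closer on average than distant ones such as "phones" and "cookware". A trained model would combine a prior of this kind with a likelihood over the observed text, which this sketch omits.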
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
