Learning Multi-Sense Word Distributions using Approximate Kullback-Leibler Divergence
P. Jayashree, Ballijepalli Shreya, and P.K. Srijith

TL;DR
This paper introduces a method for learning multi-sense word embeddings as Gaussian mixtures using an approximate KL divergence, improving the modeling of polysemy and uncertainty in word representations.
Contribution
It proposes learning multi-sense word distributions as Gaussian mixtures with a KL divergence-based objective, using an approximation because the exact KL divergence between Gaussian mixtures is intractable.
Findings
Improved performance on word similarity benchmarks
Effective modeling of polysemy and uncertainty
Better capture of entailment relations
Abstract
Learning word representations has garnered greater attention in the recent past due to its diverse text applications. Word embeddings encapsulate the syntactic and semantic regularities of sentences. Modelling word embeddings as multi-sense Gaussian mixture distributions additionally captures the uncertainty and polysemy of words. We propose to learn the Gaussian mixture representation of words using a Kullback-Leibler (KL) divergence based objective function. The KL divergence based energy function provides a better distance metric which can effectively capture entailment and distribution similarity among the words. Because the KL divergence between Gaussian mixtures is intractable, we use an approximation of it. We perform qualitative and quantitative experiments on benchmark word similarity and entailment datasets which demonstrate the effectiveness of the…
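The abstract does not say which KL approximation the authors use, so the sketch below should not be read as their method. It illustrates one well-known option, the variational approximation of Hershey and Olsen, for mixtures of diagonal-covariance Gaussians. The function names, the `(weight, mean, variance)` tuple layout, and the toy "bank" and "river" mixtures are all assumptions made for illustration.

```python
import math

def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) between two diagonal-covariance Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total

def approx_kl_mixtures(f, g):
    """Variational approximation to KL(f || g) for Gaussian mixtures.

    Each mixture is a list of (weight, mean, variance) tuples, where mean
    and variance are equal-length lists (diagonal covariance).
    Illustrative only: for very distant components the exponentials can
    underflow, which a log-sum-exp formulation would avoid.
    """
    result = 0.0
    for w_a, mu_a, var_a in f:
        # Similarity of component a to the components of its own mixture f.
        num = sum(w * math.exp(-kl_diag_gaussians(mu_a, var_a, mu, var))
                  for w, mu, var in f)
        # Similarity of component a to the components of the other mixture g.
        den = sum(w * math.exp(-kl_diag_gaussians(mu_a, var_a, mu, var))
                  for w, mu, var in g)
        result += w_a * math.log(num / den)
    return result

# Hypothetical two-sense word ("bank") versus a single-sense word ("river").
bank = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [4.0, 4.0], [1.0, 1.0])]
river = [(1.0, [0.2, 0.1], [0.5, 0.5])]
print(approx_kl_mixtures(river, bank), approx_kl_mixtures(bank, river))
```

The two printed values generally differ because KL divergence is asymmetric, and this asymmetry is what makes a KL-based energy a candidate for scoring directional relations such as entailment. When both mixtures have a single component, the approximation reduces to the exact Gaussian KL divergence.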
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
