TL;DR
PolyLM learns sense embeddings by treating the problem as language modeling, using contextual information to capture polysemy. It outperforms previous sense embedding techniques on word sense induction (WSI) tasks and matches state-of-the-art performance with significantly fewer parameters.
Contribution
The paper presents PolyLM, a method that formulates sense embedding learning as a language modeling task, combining contextualization with sense-specific representations.
Findings
PolyLM outperforms previous sense embedding techniques on WSI tasks.
PolyLM matches state-of-the-art performance with significantly fewer parameters.
Code and models are publicly available for further research.
Abstract
To avoid the "meaning conflation deficiency" of word embeddings, a number of models have aimed to embed individual word senses. These methods at one time performed well on tasks such as word sense induction (WSI), but they have since been overtaken by task-specific techniques which exploit contextualized embeddings. However, sense embeddings and contextualization need not be mutually exclusive. We introduce PolyLM, a method which formulates the task of learning sense embeddings as a language modeling problem, allowing contextualization techniques to be applied. PolyLM is based on two underlying assumptions about word senses: firstly, that the probability of a word occurring in a given context is equal to the sum of the probabilities of its individual senses occurring; and secondly, that for a given occurrence of a word, one of its senses tends to be much more plausible in the context…
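The first assumption in the abstract (a word's probability in a context is the sum of the probabilities of its senses) can be illustrated with a toy sketch. This is not PolyLM's actual architecture: the two-dimensional embeddings, the three-sense vocabulary, the dot-product scoring, and the function names `word_probabilities` and `sense_posterior` are all hypothetical, chosen only to make the arithmetic visible.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def word_probabilities(context_vec, sense_embeddings, sense_to_word):
    """Score every sense against the context, then sum sense
    probabilities per word (assumption 1 from the abstract)."""
    logits = [
        sum(c * s for c, s in zip(context_vec, emb))
        for emb in sense_embeddings
    ]
    sense_probs = softmax(logits)
    word_probs = {}
    for p, word in zip(sense_probs, sense_to_word):
        word_probs[word] = word_probs.get(word, 0.0) + p
    return sense_probs, word_probs


def sense_posterior(word, sense_probs, sense_to_word):
    """Distribution over the senses of one word, given that the word occurred."""
    own = [p for p, w in zip(sense_probs, sense_to_word) if w == word]
    total = sum(own)
    return [p / total for p in own]


# Hypothetical toy data: "bank" has two senses, "river" has one.
sense_to_word = ["bank", "bank", "river"]
sense_embeddings = [[2.0, 0.0], [0.0, 2.0], [0.0, 1.0]]
context_vec = [1.5, 0.0]  # a context that favors the first sense of "bank"

sense_probs, word_probs = word_probabilities(
    context_vec, sense_embeddings, sense_to_word
)
posterior = sense_posterior("bank", sense_probs, sense_to_word)
print(word_probs)
print(posterior)
```

In this toy context the posterior over the senses of "bank" is concentrated on a single sense, which is the situation the second assumption describes. How PolyLM encourages that concentration during training is not covered by this sketch.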
