Large Scale Substitution-based Word Sense Induction
Matan Eyal, Shoval Sadde, Hillel Taub-Tabib, Yoav Goldberg

TL;DR
This paper introduces a scalable method for word sense induction using pre-trained masked language models, producing high-quality sense-tagged corpora and embeddings that outperform existing methods on multiple datasets.
Contribution
The paper presents a novel, scalable approach to word sense induction leveraging MLMs, enabling corpus-specific sense discovery and high-quality sense embeddings.
Findings
Induced senses are of high quality, comparable to WSD methods like Babelfy.
Senseful embeddings outperform existing methods on WiC and outlier detection datasets.
The method can induce domain-specific senses not present in standard inventories.
Abstract
We present a word-sense induction method based on pre-trained masked language models (MLMs), which can cheaply scale to large vocabularies and large corpora. The result is a corpus which is sense-tagged according to a corpus-derived sense inventory and where each sense is associated with indicative words. Evaluation on English Wikipedia that was sense-tagged using our method shows that both the induced senses, and the per-instance sense assignment, are of high quality even compared to WSD methods, such as Babelfy. Furthermore, by training a static word embeddings algorithm on the sense-tagged corpus, we obtain high-quality static senseful embeddings. These outperform existing senseful embeddings methods on the WiC dataset and on a new outlier detection dataset we developed. The data driven nature of the algorithm allows to induce corpora-specific senses, which may not appear in standard…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
