To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models

Bastien Liétard; Pascal Denis; Mikaela Keller

arXiv:2406.20054·cs.CL·November 10, 2025

TL;DR

This paper introduces Concept Induction, the unsupervised task of learning a soft clustering of words into shared concepts, and proposes a bi-level approach to it whose concept embeddings perform competitively on the Word-in-Context task.

Contribution

The paper proposes a bi-level approach to unsupervised Concept Induction that combines a local, lemma-centric view with a global, cross-lexicon view. The task generalizes Word Sense Induction, and the approach produces effective concept embeddings.
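The bi-level idea can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the greedy threshold clustering, the function names (`greedy_cluster`, `induce_concepts`), the thresholds, and the hand-made 2-d vectors standing in for contextualized embeddings are not taken from the paper, and the sketch runs the two levels as a simple one-way pipeline rather than letting them inform each other.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def greedy_cluster(vectors, threshold):
    """Hypothetical stand-in clustering: put each vector in the first
    cluster whose centroid is similar enough, else open a new cluster.
    Returns a list of clusters, each a list of indices into `vectors`."""
    clusters = []
    for i, v in enumerate(vectors):
        for members in clusters:
            centroid = mean([vectors[j] for j in members])
            if cosine(v, centroid) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def induce_concepts(occurrences, local_threshold=0.8, global_threshold=0.8):
    """occurrences maps each lemma to the embeddings of its occurrences."""
    # Local level: cluster each lemma's occurrences into senses.
    senses = []  # (lemma, sense centroid)
    for lemma, vecs in occurrences.items():
        for members in greedy_cluster(vecs, local_threshold):
            senses.append((lemma, mean([vecs[j] for j in members])))
    # Global level: cluster sense centroids across the whole lexicon.
    groups = greedy_cluster([c for _, c in senses], global_threshold)
    # A concept is the set of lemmas whose senses fall in the same group.
    return [sorted({senses[j][0] for j in members}) for members in groups]

# Toy data: "bank" has two money-like occurrences and one river-like one.
toy = {
    "bank": [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]],
    "shore": [[0.1, 0.9]],
    "lender": [[1.0, 0.1]],
}
concepts = induce_concepts(toy)
print(concepts)
```

In this toy run the lemma "bank" ends up in two concepts, which is what makes the resulting clustering over words soft: a polysemous word can belong to several concepts at once.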

Findings

01. Achieved BCubed F1 above 0.60 on SemCor data

02. Local and global levels mutually improve concept and sense induction

03. Concept embeddings perform competitively on the Word-in-Context task
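For readers unfamiliar with the BCubed F1 score mentioned in the findings, here is a minimal sketch of the standard hard-clustering definition. It assumes each item has exactly one gold label and one predicted cluster; the paper evaluates a soft clustering, so the exact variant it uses may differ, and the function name `bcubed_f1` is chosen here for illustration.

```python
from collections import Counter

def bcubed_f1(gold, pred):
    """BCubed precision, recall and F1 for a hard clustering.

    gold, pred: equal-length sequences giving, for each item,
    its gold class and its predicted cluster.
    """
    if len(gold) != len(pred) or len(gold) == 0:
        raise ValueError("gold and pred must be non-empty and equal length")
    n = len(gold)
    gold_sizes = Counter(gold)        # items per gold class
    pred_sizes = Counter(pred)        # items per predicted cluster
    joint = Counter(zip(gold, pred))  # items per (class, cluster) pair

    # Per-item precision: share of the item's cluster that has its gold class.
    precision = sum(joint[(g, p)] / pred_sizes[p] for g, p in zip(gold, pred)) / n
    # Per-item recall: share of the item's gold class that is in its cluster.
    recall = sum(joint[(g, p)] / gold_sizes[g] for g, p in zip(gold, pred)) / n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Four items, two gold classes; the third item is placed in the wrong cluster.
p, r, f = bcubed_f1(["a", "a", "b", "b"], [0, 0, 0, 1])
print(round(p, 3), round(r, 3), round(f, 3))
```

Because every item counts itself, precision and recall are always strictly positive, so the harmonic mean is well defined.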

Abstract

Polysemy and synonymy are two crucial interrelated facets of lexical ambiguity. While both phenomena are widely documented in lexical resources and have been studied extensively in NLP, leading to dedicated systems, they are often considered independently in practical problems. While many tasks dealing with polysemy (e.g. Word Sense Disambiguation or Induction) highlight the role of a word's senses, the study of synonymy is rooted in the study of concepts, i.e. meanings shared across the lexicon. In this paper, we introduce Concept Induction, the unsupervised task of learning a soft clustering among words that defines a set of concepts directly from data. This task generalizes Word Sense Induction. We propose a bi-level approach to Concept Induction that leverages both a local lemma-centric view and a global cross-lexicon view to induce concepts. We evaluate the obtained clustering…

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Sparse Evolutionary Training