Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks
Maurício Gruppi, Sibel Adalı, Pin-Yu Chen

TL;DR
This paper introduces a self-supervised method for detecting lexical semantic change between monolingual word embeddings. The method improves embedding alignment and helps characterize how language varies over time or across groups.
Contribution
It proposes a novel self-supervised approach that enhances alignment techniques and landmark word selection for better semantic change detection.
Findings
Significant improvements over existing alignment methods.
Effective detection of semantic change across multiple datasets.
Potential to uncover new insights in language variation.
Abstract
The use of language is subject to variation over time as well as across social groups and knowledge domains, leading to differences even in the monolingual scenario. Such variation in word usage is often called lexical semantic change (LSC). The goal of LSC is to characterize and quantify language variations with respect to word meaning, to measure how distinct two language sources are (that is, people or language models). Because there is hardly any data available for such a task, most solutions involve unsupervised methods to align two embeddings and predict semantic change with respect to a distance measure. To that end, we propose a self-supervised approach to model lexical semantic change by generating training samples by introducing perturbations of word vectors in the input corpora. We show that our method can be used for the detection of semantic change with any alignment…
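The pipeline the abstract describes (align two embeddings, then score each word by a distance measure) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes orthogonal Procrustes as the alignment step and cosine distance as the measure, and it uses random vectors in place of real embeddings. The names `align_procrustes`, `cosine_distance`, `emb_a`, `emb_b`, and `landmarks` are hypothetical. One word vector is perturbed on purpose to simulate a semantic shift, loosely in the spirit of the self-supervised idea of generating samples by perturbation.

```python
import numpy as np


def align_procrustes(source, target):
    """Return the orthogonal matrix Q minimizing ||source @ Q - target||_F."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def cosine_distance(a, b):
    """Row-wise cosine distance between two matrices of equal shape."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return 1.0 - num / den


# Toy data (assumption): emb_b is a rotated copy of emb_a, standing in for
# two embeddings trained separately on two corpora.
rng = np.random.default_rng(0)
n_words, dim = 50, 8
emb_a = rng.normal(size=(n_words, dim))
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
emb_b = emb_a @ rotation

# Simulate a semantic shift by replacing one word's vector in emb_b.
shifted = 3
emb_b[shifted] = rng.normal(size=dim)

# Landmark words (assumption): every word except the perturbed one.
landmarks = np.array([i for i in range(n_words) if i != shifted])

# Learn the alignment on landmarks only, then score every word.
q = align_procrustes(emb_a[landmarks], emb_b[landmarks])
distances = cosine_distance(emb_a @ q, emb_b)
most_changed = int(np.argmax(distances))
```

In this toy setting the landmark words align almost exactly, so the perturbed word stands out with the largest distance. With real embeddings the choice of landmark words matters much more, which is the part of the problem the paper targets.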
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Language and cultural evolution
