Unsupervised Embedding-based Detection of Lexical Semantic Changes
Ehsaneddin Asgari, Christoph Ringlstetter, Hinrich Schütze

TL;DR
This paper presents EmbLexChange, an unsupervised embedding-based system that detects lexical semantic changes in multiple languages by measuring the divergence of a word's neighborhood profiles over time.
Contribution
It introduces a novel divergence-based method for unsupervised detection of semantic change using embedding neighborhoods and a resampling framework for reference word selection.
Findings
Effective detection of semantic change in multiple languages
Achieved second place in SemEval-2020 Task 1
Reliable detection using a resampling framework
Abstract
This paper describes EmbLexChange, a system introduced by the "Life-Language" team for SemEval-2020 Task 1 on unsupervised detection of lexical-semantic changes. The EmbLexChange score of a word w is defined as the divergence between the embedding-based profiles of w (calculated with respect to a set of reference words) in the source and target domains, which can simply be two time frames t1 and t2. The underlying assumption is that a lexical-semantic change of word w affects its co-occurring words and subsequently alters its neighborhoods in the embedding spaces. We show that, using a resampling framework for the selection of reference words, we can reliably detect lexical-semantic changes in English, German, Swedish, and Latin. EmbLexChange achieved second place in the binary detection of semantic changes at SemEval-2020.
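The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the use of cosine similarity, a softmax to turn similarities into a profile, KL divergence as the divergence measure, uniform random resampling of reference words, and all function and parameter names (profile, kl, change_score, n_refs, n_resamples) are assumptions made for illustration. The abstract only states that profiles are computed against reference words, compared by a divergence, and that reference words are chosen through resampling.

```python
import numpy as np

def profile(emb, word, refs):
    """Profile of `word`: softmax over its cosine similarities to `refs`.
    (Assumed form; the source does not specify how profiles are built.)"""
    v = emb[word]
    sims = np.array([
        v @ emb[r] / (np.linalg.norm(v) * np.linalg.norm(emb[r]))
        for r in refs
    ])
    e = np.exp(sims - sims.max())  # shift for numerical stability
    return e / e.sum()

def kl(p, q):
    """KL divergence between two strictly positive distributions."""
    return float(np.sum(p * np.log(p / q)))

def change_score(emb_src, emb_tgt, word, ref_pool,
                 n_refs=3, n_resamples=50, seed=0):
    """Mean divergence of the source and target profiles of `word`
    over random subsets of reference words (hypothetical resampling scheme)."""
    rng = np.random.default_rng(seed)
    pool = [r for r in ref_pool if r != word]
    scores = []
    for _ in range(n_resamples):
        idx = rng.choice(len(pool), size=n_refs, replace=False)
        refs = [pool[i] for i in idx]
        p = profile(emb_src, word, refs)
        q = profile(emb_tgt, word, refs)
        scores.append(kl(p, q))
    return float(np.mean(scores))
```

Here emb_src and emb_tgt are assumed to be dictionaries mapping words to NumPy vectors trained separately on the two time frames. Because each profile is built only from similarities within its own space, the two spaces do not need to be aligned with each other in this sketch. A word whose neighborhoods are unchanged gets a score near zero, and a binary decision would require a threshold that the sketch does not choose.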
Taxonomy
Topics: Language and cultural evolution · Topic Modeling · Advanced Text Analysis Techniques
