Contextualized language models for semantic change detection: lessons learned

Andrey Kutuzov; Erik Velldal; Lilja Øvrelid

arXiv:2209.00154·cs.CL·September 2, 2022

TL;DR

This paper qualitatively analyzes the outputs of contextualized embedding-based methods for detecting diachronic semantic change, highlights where they go wrong, and introduces an ensemble method that outperforms previously described contextualized approaches.

Contribution

It introduces an ensemble method for semantic change detection and uses it as the basis for an in-depth analysis of the change scores predicted for English words across 5 decades.

Findings

01

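Contextualized methods often predict high change scores for words that are not undergoing any real diachronic semantic shift in the lexicographic sense, or whose shift is at least questionable.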

02

Pre-trained models confound lexical sense changes with contextual variance.

03

Models tend to merge syntactic and semantic aspects of words.

Abstract

We present a qualitative analysis of the (potentially erroneous) outputs of contextualized embedding-based methods for detecting diachronic semantic change. First, we introduce an ensemble method outperforming previously described contextualized approaches. This method is used as a basis for an in-depth analysis of the degrees of semantic change predicted for English words across 5 decades. Our findings show that contextualized methods can often predict high change scores for words which are not undergoing any real diachronic semantic shift in the lexicographic sense of the term (or at least the status of these shifts is questionable). Such challenging cases are discussed in detail with examples, and their linguistic categorization is proposed. Our conclusion is that pre-trained contextualized language models are prone to confound changes in lexicographic senses and changes in…

Code & Models

Repositories

ltgoslo/lscd_lessons
Official
