Rethinking Metrics for Lexical Semantic Change Detection
Roksana Goworek, Haim Dubossarsky

TL;DR
This paper introduces two new metrics, AMD and SAMD, for lexical semantic change detection with contextualised embeddings. Across languages and encoder models, AMD is often more robust than the standard metrics APD and PRT, while SAMD performs best with specialised encoders.
Contribution
The paper proposes two semantic change metrics, AMD and SAMD, which quantify change via local correspondence between word usages across time periods and are often more robust than APD and PRT.
Findings
AMD is often more robust than APD and PRT, particularly under dimensionality reduction and with non-specialised encoders.
SAMD excels with specialised encoders.
LSCD may benefit from considering semantic change metrics beyond APD and PRT.
Abstract
Lexical semantic change detection (LSCD) increasingly relies on contextualised language model embeddings, yet most approaches still quantify change using a small set of semantic change metrics, primarily Average Pairwise Distance (APD) and cosine distance over word prototypes (PRT). We introduce Average Minimum Distance (AMD) and Symmetric Average Minimum Distance (SAMD), new measures that quantify semantic change via local correspondence between word usages across time periods. Across multiple languages, encoder models, and representation spaces, we show that AMD often provides more robust performance, particularly under dimensionality reduction and with non-specialised encoders, while SAMD excels with specialised encoders. We suggest that LSCD may benefit from considering alternative semantic change metrics beyond APD and PRT, with AMD offering a robust option for contextualised…
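The four metrics named in the abstract can be sketched in a few lines. This is only an illustration and not the authors' implementation: the abstract does not give formulas for AMD or SAMD, so the definitions below are an assumed reading of the metric names (each usage in one period is matched to its nearest usage in the other period, and the resulting minimum distances are averaged). The use of cosine distance, the direction of the AMD matching, and all function names are likewise assumptions.

```python
import numpy as np


def cosine_distances(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine distances between rows of a (n, d) and b (m, d)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T


def apd(x1: np.ndarray, x2: np.ndarray) -> float:
    """Average Pairwise Distance: mean distance over all cross-period usage pairs."""
    return float(cosine_distances(x1, x2).mean())


def prt(x1: np.ndarray, x2: np.ndarray) -> float:
    """Prototype distance: cosine distance between the mean usage embeddings."""
    p1 = x1.mean(axis=0, keepdims=True)
    p2 = x2.mean(axis=0, keepdims=True)
    return float(cosine_distances(p1, p2)[0, 0])


def amd(x1: np.ndarray, x2: np.ndarray) -> float:
    """Assumed AMD: for each usage in x1, distance to its nearest usage in x2, averaged."""
    return float(cosine_distances(x1, x2).min(axis=1).mean())


def samd(x1: np.ndarray, x2: np.ndarray) -> float:
    """Assumed SAMD: mean of the two directed AMD values."""
    return 0.5 * (amd(x1, x2) + amd(x2, x1))
```

Each function takes two arrays of contextualised usage embeddings for one target word, one array per time period, with shape (number of usages, embedding dimension). Under this reading, AMD is asymmetric: a sense that appears only in the first array raises `amd(x1, x2)` but leaves `amd(x2, x1)` unchanged, which is why a symmetric variant is a natural companion.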
Taxonomy
Topics: Language and cultural evolution · Authorship Attribution and Profiling · Computational and Text Analysis Methods
