Tracking Semantic Change in Slovene: A Novel Dataset and Optimal Transport-Based Distance
Marko Pranjić, Kaja Dobrovoljc, Senja Pollak, Matej Martinc

TL;DR
This paper introduces a new Slovene dataset for semantic change detection, analyzes existing metrics, and proposes an optimal transport-based method that improves the robustness and accuracy of detecting semantic shifts.
Contribution
It provides the first Slovene semantic change dataset and develops a novel optimal transport-based metric for more effective semantic change detection.
Findings
Optimal transport metric outperforms existing measures
New dataset enables evaluation of semantic change methods
Proposed approach achieves comparable or better results
Abstract
In this paper, we focus on the detection of semantic changes in Slovene, a less-resourced Slavic language with two million speakers. Detecting and tracking semantic changes provides insight into the evolution of language caused by changes in society and culture. We present the first Slovene dataset for evaluating semantic change detection systems, which contains aggregated semantic change scores for 104 target words obtained from more than 3,000 manually annotated sentence pairs. We analyze an important class of semantic change measures based on the average pairwise distance and identify several limitations. To address these limitations, we propose a novel metric based on regularized optimal transport, which offers a more robust framework for quantifying semantic change. We provide a comprehensive evaluation of various existing semantic change detection methods and associated…
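The two families of measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of cosine distance, uniform weights over usages, and the `reg` and `n_iters` values are all assumptions made for illustration. Each row of `A` and `B` stands for a contextual embedding of one usage of the target word in the earlier and later period respectively. The second function uses entropic regularization solved with plain Sinkhorn iterations, which is one common form of regularized optimal transport; the paper's exact formulation may differ.

```python
import numpy as np

def cosine_distance_matrix(A, B):
    # Pairwise cosine distances between rows of A (n x d) and rows of B (m x d).
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return 1.0 - A @ B.T

def average_pairwise_distance(A, B):
    # APD: mean distance over all cross-period usage pairs.
    return float(cosine_distance_matrix(A, B).mean())

def sinkhorn_cost(A, B, reg=0.1, n_iters=200):
    # Entropy-regularized optimal transport cost between the two usage sets.
    # Assumption: every usage carries equal weight (uniform marginals).
    C = cosine_distance_matrix(A, B)
    n, m = C.shape
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    K = np.exp(-C / reg)          # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):      # alternate scaling to match both marginals
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]   # approximate transport plan
    return float((P * C).sum())

# Toy usage: a word with two distinct senses, unchanged between periods.
A = np.array([[1.0, 0.0], [0.0, 1.0]])
print(average_pairwise_distance(A, A))
print(sinkhorn_cost(A, A))
```

In this toy case the two periods are identical, yet APD is well above zero because it averages over cross-sense pairs, while the transport cost stays close to zero because each usage is matched to its counterpart. The abstract does not say whether this is among the limitations the authors identify; it is shown only as one property that is easy to check.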
Taxonomy
Topics: Authorship Attribution and Profiling · Natural Language Processing Techniques · Language and cultural evolution
