TL;DR
This paper evaluates BERT for lexical semantic change detection in Italian and finds that it does not outperform traditional methods such as SGNS, even though its parameters were tuned to high performance on English data.
Contribution
The study provides an empirical comparison of BERT-based embeddings versus SGNS in semantic change detection for Italian, highlighting limitations of BERT in this task.
Findings
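BERT-based embeddings achieved 72% accuracy in Italian semantic change detection, ranking 5th of 8 in the official DIACR-Ita ranking.
Parameters tuned on the English SemEval-2020 Task 1 data set reach high performance there, but this does not carry over to the Italian data.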
BERT does not outperform SGNS in lexical semantic change detection for Italian.
Abstract
We present the results of our participation in the DIACR-Ita shared task on lexical semantic change detection for Italian. We exploit Average Pairwise Distance of token-based BERT embeddings between time points and rank 5 (of 8) in the official ranking with an accuracy of .72. While we tune parameters on the English data set of SemEval-2020 Task 1 and reach high performance, this does not translate to the Italian DIACR-Ita data set. Our results show that we do not manage to find robust ways to exploit BERT embeddings in lexical semantic change detection.
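As a rough illustration of the Average Pairwise Distance measure named in the abstract, the sketch below averages the cosine distance over every pair of usage vectors drawn from two time periods. It is a minimal sketch under stated assumptions, not the authors' implementation: the choice of cosine distance, the function names `average_pairwise_distance` and `predict_change`, the threshold value, and the random toy vectors standing in for BERT embeddings are all illustrative.

```python
import numpy as np


def average_pairwise_distance(emb_t1: np.ndarray, emb_t2: np.ndarray) -> float:
    """Mean cosine distance over all cross-period pairs of usage vectors.

    emb_t1 has shape (n1, d) and emb_t2 has shape (n2, d); each row is one
    token-level embedding of the target word in one sentence.
    """
    # L2-normalise rows so that a dot product equals cosine similarity.
    a = emb_t1 / np.linalg.norm(emb_t1, axis=1, keepdims=True)
    b = emb_t2 / np.linalg.norm(emb_t2, axis=1, keepdims=True)
    # (n1, n2) matrix of cosine distances between every t1/t2 pair.
    distances = 1.0 - a @ b.T
    return float(distances.mean())


def predict_change(apd: float, threshold: float) -> int:
    """Hypothetical decision rule (an assumption): 1 = changed, 0 = stable."""
    return int(apd > threshold)


# Toy stand-ins for BERT vectors; real ones would come from a BERT model.
rng = np.random.default_rng(0)
usages_t1 = rng.normal(size=(5, 8))  # 5 usages from the earlier period
usages_t2 = rng.normal(size=(7, 8))  # 7 usages from the later period

score = average_pairwise_distance(usages_t1, usages_t2)
print(score, predict_change(score, threshold=0.5))
```

A larger score means the word's usages in the two periods are, on average, further apart. How the score is turned into a binary changed/stable label (here a fixed threshold) is one of the parameters that would need tuning.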
Taxonomy
Methods: Linear Layer · Dropout · Softmax · Multi-Head Attention · Attention Dropout · Residual Connection · Dense Connections · WordPiece · Layer Normalization · Adam
