CL-IMS @ DIACR-Ita: Volente o Nolente: BERT does not outperform SGNS on Semantic Change Detection

Severin Laicher; Gioia Baldissin; Enrique Castañeda; Dominik Schlechtweg; Sabine Schulte im Walde

arXiv:2011.07247 · cs.CL · December 4, 2020

TL;DR

This paper evaluates token-based BERT embeddings for lexical semantic change detection in Italian (the DIACR-Ita shared task) and finds that they do not outperform SGNS, even with parameters tuned on the English SemEval-2020 Task 1 data.

Contribution

The study provides an empirical comparison of BERT-based embeddings versus SGNS in semantic change detection for Italian, highlighting limitations of BERT in this task.

Findings

01

Average Pairwise Distance over token-based BERT embeddings reached an accuracy of .72 on DIACR-Ita, ranking 5th of 8 in the official ranking.

02

Parameters tuned on the English SemEval-2020 Task 1 data reach high performance there, but that performance does not carry over to the Italian DIACR-Ita data.

03

BERT does not outperform SGNS in lexical semantic change detection for Italian.

Abstract

We present the results of our participation in the DIACR-Ita shared task on lexical semantic change detection for Italian. We exploit Average Pairwise Distance of token-based BERT embeddings between time points and rank 5 (of 8) in the official ranking with an accuracy of $.72$. While we tune parameters on the English data set of SemEval-2020 Task 1 and reach high performance, this does not translate to the Italian DIACR-Ita data set. Our results show that we do not manage to find robust ways to exploit BERT embeddings in lexical semantic change detection.
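The Average Pairwise Distance (APD) measure named in the abstract can be illustrated with a minimal sketch. It assumes cosine distance between token vectors and a fixed decision threshold for the binary changed/unchanged label; the abstract specifies neither, so both are assumptions, as are the function names and the random arrays standing in for real BERT token embeddings.

```python
import numpy as np


def average_pairwise_distance(emb_t1, emb_t2):
    """Mean cosine distance over all cross-period pairs of token vectors.

    emb_t1, emb_t2: arrays of shape (n_uses, dim), one row per usage of
    the target word in each time period. Cosine distance is an assumption.
    """
    a = emb_t1 / np.linalg.norm(emb_t1, axis=1, keepdims=True)
    b = emb_t2 / np.linalg.norm(emb_t2, axis=1, keepdims=True)
    cos_dist = 1.0 - a @ b.T  # shape (n_uses_t1, n_uses_t2)
    return float(cos_dist.mean())


def classify_change(apd, threshold):
    """Return 1 (changed) if the APD exceeds a hypothetical threshold, else 0."""
    return int(apd > threshold)


# Random vectors as placeholders for token embeddings of one target word.
rng = np.random.default_rng(0)
uses_t1 = rng.normal(size=(20, 768))
uses_t2 = rng.normal(size=(25, 768))

apd = average_pairwise_distance(uses_t1, uses_t2)
label = classify_change(apd, threshold=0.5)  # threshold value is illustrative only
print(f"APD = {apd:.3f}, predicted label = {label}")
```

In a setup like the one described, the threshold would be one of the parameters tuned on held-out data, which is where the reported English-to-Italian transfer problem would show up.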

Code & Models

Repositories

Garrafao/TokenChange (official)

Taxonomy

Methods: Linear Layer · Dropout · Softmax · Multi-Head Attention · Attention Dropout · Residual Connection · Dense Connections · WordPiece · Layer Normalization · Adam