TL;DR
This paper explores fine-tuning BERT with ReLU over cosine similarity for word-in-context disambiguation, reaching 92.7% accuracy (fourth place) in the EN-EN sub-track of SemEval-2021 Task 2.
Contribution
It introduces a fine-tuning approach that combines ReLU activation with cosine similarity, which proved the most effective of the top layers tested on the English (EN-EN) word-in-context disambiguation sub-track.
Findings
Applying ReLU over cosine similarity gave the most effective fine-tuning procedure among the top layers tested.
The best model achieved 92.7% accuracy, ranking fourth in the EN-EN sub-track.
The choice of top layer affects fine-tuning results.
Abstract
This paper presents our contribution to SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC). Our experiments cover the English (EN-EN) sub-track from the multilingual setting of the task. We experiment with several pre-trained language models and investigate the impact of different top layers on fine-tuning. We find that combining cosine similarity with ReLU activation yields the most effective fine-tuning procedure. Our best model achieves an accuracy of 92.7%, which is the fourth-best score in the EN-EN sub-track.
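The abstract does not spell out the top layer in detail, so the following is only a minimal sketch of what "ReLU over cosine similarity" could look like. It assumes the embeddings of the target word in each of the two contexts have already been extracted from the encoder. The function names, the 0.5 decision threshold, and the toy vectors are illustrative assumptions, not taken from the paper.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def relu(x: float) -> float:
    """Rectified linear unit: clips negative values to zero."""
    return max(0.0, x)


def same_sense_score(emb_a: Sequence[float], emb_b: Sequence[float]) -> float:
    """ReLU over cosine similarity: maps the similarity from [-1, 1] into [0, 1]."""
    return relu(cosine_similarity(emb_a, emb_b))


def predict_same_sense(emb_a: Sequence[float], emb_b: Sequence[float],
                       threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: 'same sense' if the score reaches the threshold."""
    return same_sense_score(emb_a, emb_b) >= threshold


# Toy 2-dimensional "embeddings" standing in for real encoder outputs.
word_in_context_1 = [1.0, 1.0]
word_in_context_2 = [1.0, 0.0]
score = same_sense_score(word_in_context_1, word_in_context_2)
decision = predict_same_sense(word_in_context_1, word_in_context_2)
```

Because ReLU clips negative similarities to zero, the score lies in [0, 1] and can be read as a same-sense confidence for the binary word-in-context decision. Whether the authors threshold it or train it against a loss is not stated in the text above.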
