TL;DR
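This paper fine-tunes pre-trained BERT models on lexical semantic tasks to predict graded word similarity in context for English and Finnish. The best English models placed third and fourth in the two subtasks, while the Finnish models were mid-ranked, which the authors attribute to limited data for fine-tuning.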
Contribution
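It injects semantic knowledge into pre-trained BERT models by fine-tuning them on lexical semantic tasks related to GWSC, using existing semantically annotated datasets and automatically generated lexical substitutes in context as an approximation of similarity.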
Findings
The best English models ranked third and fourth in the two GWSC subtasks.
The Finnish models were mid-ranked in the same subtasks, highlighting the role of data availability for fine-tuning.
Fine-tuning improves BERT's performance on lexical similarity tasks.
Abstract
We present the MULTISEM systems submitted to SemEval 2020 Task 3: Graded Word Similarity in Context (GWSC). We experiment with injecting semantic knowledge into pre-trained BERT models through fine-tuning on lexical semantic tasks related to GWSC. We use existing semantically annotated datasets and propose to approximate similarity through automatically generated lexical substitutes in context. We participate in both GWSC subtasks and address two languages, English and Finnish. Our best English models occupy the third and fourth positions in the ranking for the two subtasks. Performance is lower for the Finnish models which are mid-ranked in the respective subtasks, highlighting the important role of data availability for fine-tuning.
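The idea of approximating similarity through lexical substitutes can be illustrated with a toy sketch. The abstract does not say how substitutes are generated or compared, so everything below is an assumption made for illustration: the Jaccard overlap measure, the function name `substitute_overlap`, and the hand-written substitute lists are hypothetical and are not the authors' method.

```python
def substitute_overlap(subs_a, subs_b):
    """Jaccard overlap between two collections of substitutes (case-insensitive).

    Assumption: overlap of substitute sets serves as a rough proxy for how
    similar two words are in a given context.
    """
    a = {s.lower() for s in subs_a}
    b = {s.lower() for s in subs_b}
    if not a and not b:
        return 0.0  # no substitutes at all: treat as no evidence of similarity
    return len(a & b) / len(a | b)


# Hypothetical substitutes for the pair ("bank", "coast") in two contexts.
# In a real system these would come from a substitution model, not by hand.
context1 = {
    "bank": ["shore", "riverside", "edge"],
    "coast": ["shore", "seaside", "edge"],
}
context2 = {
    "bank": ["lender", "institution", "firm"],
    "coast": ["shore", "seaside", "edge"],
}

sim1 = substitute_overlap(context1["bank"], context1["coast"])
sim2 = substitute_overlap(context2["bank"], context2["coast"])

# Difference between the two per-context scores for the same word pair.
change = sim2 - sim1
print(sim1, sim2, change)
```

Here the same word pair receives a different score in each context, which is the kind of context-dependent, graded signal the task is concerned with.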
Taxonomy
Methods: Linear Layer · Layer Normalization · Dense Connections · Weight Decay · WordPiece · Residual Connection · Attention Is All You Need · Multi-Head Attention · Adam
