PALI at SemEval-2021 Task 2: Fine-Tune XLM-RoBERTa for Word in Context Disambiguation
Shuyi Xie, Jian Ma, Haiqin Yang, Lianxin Jiang, Yang Mo, Jianping Shen

TL;DR
This paper describes a fine-tuned XLM-RoBERTa model with novel input tagging and embedding strategies that achieved top results in multilingual word-in-context disambiguation tasks at SemEval-2021.
Contribution
The paper introduces a new approach combining input tagging, embedding concatenation, and training tricks to enhance cross-lingual word sense disambiguation performance.
Findings
Achieved first place in all four cross-lingual tasks
Effective use of input tags to emphasize target words
Improved model accuracy with data augmentation and adversarial training
Abstract
This paper presents the PALI team's winning system for SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation. We fine-tune XLM-RoBERTa model to solve the task of word in context disambiguation, i.e., to determine whether the target word in the two contexts contains the same meaning or not. In the implementation, we first specifically design an input tag to emphasize the target word in the contexts. Second, we construct a new vector on the fine-tuned embeddings from XLM-RoBERTa and feed it to a fully-connected network to output the probability of whether the target word in the context has the same meaning or not. The new vector is attained by concatenating the embedding of the [CLS] token and the embeddings of the target word in the contexts. In training, we explore several tricks, such as the Ranger optimizer, data augmentation, and adversarial training, to…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
