Adapting BERT for Word Sense Disambiguation with Gloss Selection Objective and Example Sentences

Boon Peng Yap; Andrew Koh; Eng Siong Chng

arXiv:2009.11795·cs.CL·October 2, 2020

TL;DR

This paper presents a novel approach to word sense disambiguation by fine-tuning BERT as a relevance ranking model with data augmentation, achieving state-of-the-art results on benchmark datasets.

Contribution

It formulates WSD as a relevance ranking task and introduces a data augmentation method using WordNet examples for improved performance.

Findings

01 Achieves state-of-the-art results on English all-words WSD datasets

02 Relevance ranking formulation improves sense disambiguation accuracy

03 Data augmentation enhances model robustness and generalization

Abstract

Domain adaptation or transfer learning using pre-trained language models such as BERT has proven to be an effective approach for many natural language processing tasks. In this work, we propose to formulate word sense disambiguation (WSD) as a relevance ranking task, and fine-tune BERT on a sequence-pair ranking task to select the most probable sense definition given a context sentence and a list of candidate sense definitions. We also introduce a data augmentation technique for WSD using existing example sentences from WordNet. Using the proposed training objective and data augmentation technique, our models achieve state-of-the-art results on the English all-words benchmark datasets.
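The ranking formulation in the abstract can be illustrated with a minimal sketch. It pairs one context sentence with every candidate sense definition (gloss), scores each pair, and applies a softmax with cross-entropy over the candidate list so that the correct gloss is pushed to the top. Everything named here is an assumption made for illustration: `toy_score` is a token-overlap stand-in for the fine-tuned BERT relevance score, `BANK_SENSES` is a hand-written toy inventory rather than actual WordNet output, and `augment_with_examples` is only one plausible reading of how example sentences could become extra training instances. None of it is taken from the authors' repository.

```python
import math


def build_pairs(context, glosses):
    """Pair one context sentence with each candidate gloss."""
    return [(context, gloss) for gloss in glosses]


def toy_score(context, gloss):
    """Hypothetical stand-in for a BERT relevance score: shared-token count."""
    ctx_tokens = set(context.lower().split())
    gloss_tokens = set(gloss.lower().split())
    return float(len(ctx_tokens & gloss_tokens))


def softmax(scores):
    """Numerically stable softmax over the candidate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gloss_selection_loss(scores, gold_index):
    """Cross-entropy over the candidate list: low when the gold gloss ranks first."""
    return -math.log(softmax(scores)[gold_index])


def rank_glosses(context, glosses, score_fn=toy_score):
    """Return the index of the highest-scoring gloss and all scores."""
    scores = [score_fn(c, g) for c, g in build_pairs(context, glosses)]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores


def augment_with_examples(senses):
    """Assumed augmentation: each example sentence becomes a new context
    whose gold label is the sense the example belongs to."""
    glosses = [s["gloss"] for s in senses]
    return [(ex, glosses, i) for i, s in enumerate(senses) for ex in s["examples"]]


# Hand-written toy sense inventory for the word "bank" (illustrative only).
BANK_SENSES = [
    {"gloss": "a financial institution that accepts deposits",
     "examples": ["he cashed a check at the bank"]},
    {"gloss": "sloping land beside a body of water such as a river",
     "examples": ["they pulled the canoe up on the bank"]},
]
```

In a real setup the score function would be a fine-tuned sequence-pair model, and the loss would be computed per target word over its full candidate list. The sketch keeps only the shape of the objective: one context, several glosses, one softmax.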

Code & Models

Repositories

BPYap/BERT-WSD (PyTorch, official)

Taxonomy

Methods

Linear Layer · Adam · Softmax · Dense Connections · Dropout · Linear Warmup With Linear Decay · Layer Normalization · Weight Decay · Attention Dropout