Using BERT for Word Sense Disambiguation

Jiaju Du; Fanchao Qi; Maosong Sun

arXiv:1909.08358 · cs.CL · September 19, 2019 · 31 cites

TL;DR

This paper leverages BERT to improve Word Sense Disambiguation by creating better sense representations and training a unified classifier, achieving state-of-the-art results on standard benchmarks.

Contribution

It introduces a novel approach combining BERT with sense definitions for a unified WSD classifier capable of disambiguating unseen polysemes.

Findings

01 Achieved state-of-the-art performance on English All-word WSD

02 Demonstrated effectiveness of sense definitions in training

03 Unified classifier handles unseen polysemes

Abstract

Word Sense Disambiguation (WSD), which aims to identify the correct sense of a given polyseme, is a long-standing problem in NLP. In this paper, we propose to use BERT to extract better polyseme representations for WSD and explore several ways of combining BERT and the classifier. We also utilize sense definitions to train a unified classifier for all words, which enables the model to disambiguate unseen polysemes. Experiments show that our model achieves state-of-the-art results on the standard English All-word WSD evaluation.
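
The idea of a single classifier that scores a polyseme against its sense definitions can be illustrated with a minimal sketch. This is not the authors' architecture, which this page does not describe. It assumes that an encoder such as BERT has already produced one vector for the polyseme in context and one vector per candidate sense definition; the toy vectors, the sense labels, and the `disambiguate` function are all hypothetical. Scoring is a plain dot product followed by a softmax, chosen here only for simplicity.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def disambiguate(
    context_vec: Vector, sense_vecs: Dict[str, Vector]
) -> Tuple[str, Dict[str, float]]:
    """Pick the sense whose definition vector best matches the context vector.

    Assumption: both kinds of vector come from the same encoder, so their
    dot product is a meaningful similarity. Nothing here is word-specific,
    which is why the same scorer can be applied to a word that never
    appeared in training, provided its sense definitions are available.
    """
    if not sense_vecs:
        raise ValueError("at least one candidate sense is required")
    labels = list(sense_vecs)
    probs = softmax([dot(context_vec, sense_vecs[k]) for k in labels])
    dist = dict(zip(labels, probs))
    best = max(dist, key=dist.get)
    return best, dist


# Toy stand-ins for encoder outputs (hypothetical values and labels).
bank_context = [0.9, 0.1, 0.0]  # "bank" as used in a sentence about loans
bank_senses = {
    "bank%financial": [1.0, 0.0, 0.0],
    "bank%river": [0.0, 1.0, 0.0],
}
best, dist = disambiguate(bank_context, bank_senses)
print(best, dist)
```

Because the candidate senses are passed in as definition vectors rather than fixed as output classes, adding a new word only requires encoding its definitions, not retraining a per-word output layer.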

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax