PolyBERT: Fine-Tuned Poly Encoder BERT-Based Model for Word Sense Disambiguation

Linhan Xia; Mingzhan Yang; Guohui Yuan; Shengnan Tao; Yujing Qiu; Guo Yu; Kai Lei

arXiv:2506.00968 · cs.AI · June 3, 2025

TL;DR

PolyBERT is a BERT-based Word Sense Disambiguation model that balances token-level and sequence-level semantics and trains with batch contrastive learning, which reduces GPU hours by 37.6%. It outperforms baseline methods by 2% in F1-score.

Contribution

The paper proposes a poly-encoder with multi-head attention to balance token-level (local) and sequence-level (global) semantics, together with batch contrastive learning to reduce training costs in WSD.

Findings

01. PolyBERT outperforms baseline methods by 2% in F1-score.

02. Batch contrastive learning reduces GPU hours by 37.6%.

03. Balanced semantics improve WSD performance.

Abstract

Mainstream Word Sense Disambiguation (WSD) approaches have employed BERT to extract semantics from both context and definitions of senses to determine the most suitable sense of a target word, achieving notable performance. However, there are two limitations in these approaches. First, previous studies failed to balance the representation of token-level (local) and sequence-level (global) semantics during feature extraction, leading to insufficient semantic representation and a performance bottleneck. Second, these approaches incorporated all possible senses of each target word during the training phase, leading to unnecessary computational costs. To overcome these limitations, this paper introduces a poly-encoder BERT-based model with batch contrastive learning for WSD, named PolyBERT. Compared with previous WSD methods, PolyBERT has two improvements: (1) A poly-encoder with a…
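The abstract above is cut off before it describes how batch contrastive learning works, so the following is only a generic sketch of an in-batch contrastive loss, not the paper's exact objective. It assumes that each context embedding is paired with the gloss embedding of its correct sense, and that the correct glosses of the other examples in the batch serve as negatives, which avoids encoding every candidate sense of every target word. The function name, the dot-product similarity, and the `temperature` parameter are all hypothetical choices made for illustration.

```python
import math


def in_batch_contrastive_loss(context_vecs, gloss_vecs, temperature=1.0):
    """Mean cross-entropy over a batch (illustrative, not the paper's loss).

    Assumption: gloss_vecs[i] is the positive for context_vecs[i], and every
    other gloss in the batch is treated as a negative.
    """
    assert len(context_vecs) == len(gloss_vecs)
    losses = []
    for i, context in enumerate(context_vecs):
        # Dot-product similarity between context i and every gloss in the batch.
        scores = [
            sum(a * b for a, b in zip(context, gloss)) / temperature
            for gloss in gloss_vecs
        ]
        # Numerically stable log-sum-exp for the softmax normaliser.
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        # Negative log-probability assigned to the positive gloss.
        losses.append(log_z - scores[i])
    return sum(losses) / len(losses)


# Toy vectors: matched pairs give a lower loss than mismatched pairs.
contexts = [[5.0, 0.0], [0.0, 5.0]]
matched = in_batch_contrastive_loss(contexts, [[5.0, 0.0], [0.0, 5.0]])
mismatched = in_batch_contrastive_loss(contexts, [[0.0, 5.0], [5.0, 0.0]])
```

With a batch of size B, each example is scored against B glosses rather than against all senses of its own target word, which is the kind of saving the abstract attributes to batch contrastive learning.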

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Speech and dialogue systems