PolyBERT: Fine-Tuned Poly Encoder BERT-Based Model for Word Sense Disambiguation
Linhan Xia, Mingzhan Yang, Guohui Yuan, Shengnan Tao, Yujing Qiu, Guo Yu, Kai Lei

TL;DR
PolyBERT combines balanced semantic encoding with an efficient training method for Word Sense Disambiguation, improving F1-score by 2% over baseline methods and cutting GPU hours by 37.6% compared with previous BERT-based approaches.
Contribution
The paper proposes a poly-encoder with multi-head attention that balances token-level (local) and sequence-level (global) semantics, together with batch contrastive learning that reduces training costs in WSD.
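The summary names batch contrastive learning without spelling out the loss, so the following is only a minimal sketch of one common in-batch formulation. It assumes that each target word is scored against the correct sense definitions of the other target words in the same batch, instead of against its own full sense inventory. The function names, the dot-product scoring, and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def batch_contrastive_loss(context_vecs, gloss_vecs):
    """Mean cross-entropy over an in-batch similarity matrix (illustrative only).

    context_vecs[i]: embedding of target word i in its context (assumed given).
    gloss_vecs[i]:   embedding of the correct sense definition for word i.
    Row i treats gloss i as the positive and every other gloss in the
    batch as a negative, so no extra sense definitions are encoded.
    """
    n = len(context_vecs)
    total = 0.0
    for i in range(n):
        scores = [dot(context_vecs[i], g) for g in gloss_vecs]
        # Numerically stable log-sum-exp over the row.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        # Negative log-probability of the matching gloss.
        total += log_z - scores[i]
    return total / n


# Toy batch of two target words with 2-dimensional embeddings.
contexts = [[1.0, 0.0], [0.0, 1.0]]
aligned_glosses = [[2.0, 0.0], [0.0, 2.0]]   # gloss i matches context i
swapped_glosses = [[0.0, 2.0], [2.0, 0.0]]   # glosses paired with the wrong word

aligned_loss = batch_contrastive_loss(contexts, aligned_glosses)
swapped_loss = batch_contrastive_loss(contexts, swapped_glosses)
```

In this toy batch the loss is lower when each context is paired with its own gloss than when the glosses are swapped, which is the behaviour a contrastive objective is meant to reward. Under this assumption, a batch of n target words needs only n sense definitions encoded per step, which is one plausible source of the reported training-cost reduction.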
Findings
PolyBERT outperforms baseline methods by 2% in F1-score.
Batch contrastive learning reduces GPU hours by 37.6%.
Balanced semantics improve WSD performance.
Abstract
Mainstream Word Sense Disambiguation (WSD) approaches have employed BERT to extract semantics from both context and definitions of senses to determine the most suitable sense of a target word, achieving notable performance. However, there are two limitations in these approaches. First, previous studies failed to balance the representation of token-level (local) and sequence-level (global) semantics during feature extraction, leading to insufficient semantic representation and a performance bottleneck. Second, these approaches incorporated all possible senses of each target word during the training phase, leading to unnecessary computational costs. To overcome these limitations, this paper introduces a poly-encoder BERT-based model with batch contrastive learning for WSD, named PolyBERT. Compared with previous WSD methods, PolyBERT has two improvements: (1) A poly-encoder with a…
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Speech and dialogue systems
