Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning
Sheng Zhang, Hao Cheng, Jianfeng Gao, Hoifung Poon

TL;DR
This paper introduces a contrastive learning-based bi-encoder framework for named entity recognition that effectively handles nested and flat NER, outperforming previous methods across multiple datasets.
Contribution
The paper proposes a novel bi-encoder approach with a dynamic thresholding loss for NER, improving performance in both supervised and distantly supervised settings.
Findings
Achieves new state-of-the-art results on standard NER datasets.
Effective in both nested and flat NER scenarios.
Works well with noisy self-supervision signals.
Abstract
We present a bi-encoder framework for named entity recognition (NER), which applies contrastive learning to map candidate text spans and entity types into the same vector representation space. Prior work predominantly approaches NER as sequence labeling or span classification. We instead frame NER as a representation learning problem that maximizes the similarity between the vector representations of an entity mention and its type. This makes it easy to handle nested and flat NER alike, and can better leverage noisy self-supervision signals. A major challenge to this bi-encoder formulation for NER lies in separating non-entity spans from entity mentions. Instead of explicitly labeling all non-entity spans as the same class Outside (O) as in most prior methods, we introduce a novel dynamic thresholding loss. Experiments show that our method performs well in both…
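The idea in the abstract can be illustrated with a toy sketch: spans and entity types live in one vector space, a contrastive loss pulls a mention toward its type, and a span counts as an entity only if its similarity to a type exceeds a threshold that is computed rather than fixed. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the hand-written 2-d vectors stand in for encoder outputs, the loss is a standard InfoNCE-style softmax, and the threshold is taken as the similarity between a hypothetical `anchor_vec` and each type vector. The function names `cosine`, `info_nce`, and `predict` are likewise hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(span_vec, type_vecs, gold_type, temperature=0.1):
    """InfoNCE-style loss (an assumed form): negative log-softmax of the
    gold type's scaled similarity among all candidate types."""
    logits = {t: cosine(span_vec, v) / temperature for t, v in type_vecs.items()}
    max_logit = max(logits.values())  # subtract the max for numerical stability
    log_z = max_logit + math.log(
        sum(math.exp(x - max_logit) for x in logits.values())
    )
    return log_z - logits[gold_type]

def predict(span_vec, type_vecs, anchor_vec):
    """Return the type with the largest positive margin over its per-type
    threshold, or None if the span clears no threshold (a non-entity).
    Assumption: the threshold for type t is cosine(anchor_vec, type_vec_t)."""
    best_type, best_margin = None, 0.0
    for t, v in type_vecs.items():
        margin = cosine(span_vec, v) - cosine(anchor_vec, v)
        if margin > best_margin:
            best_type, best_margin = t, margin
    return best_type

# Toy vectors standing in for encoder outputs (hypothetical values).
type_vecs = {"PER": [1.0, 0.0], "LOC": [0.0, 1.0]}
anchor_vec = [1.0, 1.0]          # yields the same threshold for both types here
entity_span = [0.9, 0.1]         # close to the PER type vector
non_entity_span = [-0.5, -0.2]   # far from every type vector

print(predict(entity_span, type_vecs, anchor_vec))
print(predict(non_entity_span, type_vecs, anchor_vec))
```

Because the decision is made per span and per type, overlapping spans can each receive a label independently, which is one way to read the abstract's claim that nested and flat NER are handled alike. No separate "non-entity" class appears in `type_vecs`; rejection comes only from failing every threshold.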
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
Methods: Contrastive Learning
