BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision

Chen Liang; Yue Yu; Haoming Jiang; Siawpeng Er; Ruijia Wang; Tuo Zhao; Chao Zhang

arXiv:2006.15509 · cs.CL · June 30, 2020

TL;DR

This paper introduces BOND, a two-stage framework for open-domain NER under distant supervision: it first adapts a pre-trained language model (e.g., BERT or RoBERTa) to NER using distant labels, then drops those labels and self-trains. The authors report that it outperforms existing methods on 5 benchmark datasets.

Contribution

BOND is a novel two-stage training framework that combines pre-trained language models with self-training to address noise and incompleteness in distant supervision for NER.

Findings

01

BOND outperforms existing methods on 5 benchmark datasets.

02

The first stage, which adapts the pre-trained language model to NER using distant labels, improves recall and precision.

03

Self-training further enhances model performance.

Abstract

We study the open-domain named entity recognition (NER) problem under distant supervision. Distant supervision, though it does not require large amounts of manual annotation, yields highly incomplete and noisy distant labels via external knowledge bases. To address this challenge, we propose a new computational framework, BOND, which leverages the power of pre-trained language models (e.g., BERT and RoBERTa) to improve the prediction performance of NER models. Specifically, we propose a two-stage training algorithm: in the first stage, we adapt the pre-trained language model to the NER task using the distant labels, which can significantly improve recall and precision; in the second stage, we drop the distant labels and propose a self-training approach to further improve model performance. Thorough experiments on 5 benchmark datasets demonstrate the superiority of BOND…
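
The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: a nearest-centroid classifier over one-dimensional features stands in for the pre-trained language model, and the names `fit_centroids`, `predict`, and `two_stage_train`, the feature values, and the number of self-training rounds are all hypothetical choices made for illustration. It only shows the control flow: fit on noisy distant labels first, then discard them and refit on the model's own predictions.

```python
def fit_centroids(xs, ys, num_classes):
    """Fit one centroid per class (a toy stand-in for fine-tuning a language model)."""
    sums = [0.0] * num_classes
    counts = [0] * num_classes
    for x, y in zip(xs, ys):
        sums[y] += x
        counts[y] += 1
    # A class with no examples falls back to a centroid of 0.0.
    return [sums[c] / counts[c] if counts[c] else 0.0 for c in range(num_classes)]


def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(range(len(centroids)), key=lambda c: abs(x - centroids[c]))


def two_stage_train(xs, distant_labels, num_classes=2, rounds=3):
    # Stage 1: adapt the model using the noisy, incomplete distant labels.
    model = fit_centroids(xs, distant_labels, num_classes)
    # Stage 2: drop the distant labels. The current model acts as a teacher
    # that produces pseudo-labels, and the model is refit on those labels.
    for _ in range(rounds):
        pseudo_labels = [predict(model, x) for x in xs]
        model = fit_centroids(xs, pseudo_labels, num_classes)
    return model


# Hypothetical token features: class 0 = non-entity, class 1 = entity.
xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 4.0, 4.2, 4.4, 4.6, 4.8, 5.0]
true_labels = [0] * 6 + [1] * 6
# Incomplete distant labels: two entity tokens were missed by the knowledge base.
distant_labels = [0] * 6 + [0, 0, 1, 1, 1, 1]

model = two_stage_train(xs, distant_labels)
recovered = [predict(model, x) for x in xs]
print("distant label errors:", sum(d != t for d, t in zip(distant_labels, true_labels)))
print("model errors:", sum(r != t for r, t in zip(recovered, true_labels)))
```

In this toy setting the classes are well separated, so the model recovers the entity tokens that the distant labels missed. Real NER data is far less forgiving, which is presumably why the paper relies on a pre-trained language model rather than a simple classifier.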

Code & Models

Repositories

cliang1453/BOND
PyTorch · Official

Taxonomy

Methods

Linear Layer · Weight Decay · Softmax · Adam · Multi-Head Attention · Dropout · Attention Dropout · Linear Warmup With Linear Decay · Dense Connections