Coarse-to-Fine Pre-training for Named Entity Recognition

Mengge Xue; Bowen Yu; Zhenyu Zhang; Tingwen Liu; Yue Zhang; Bin Wang

arXiv:2010.08210·cs.CL·October 29, 2020·1 cites

TL;DR

This paper introduces a NER-specific pre-training framework that progressively injects coarse-to-fine entity knowledge into pre-trained models, improving performance on three public NER benchmarks without relying on human-labeled data for pre-training.

Contribution

The paper proposes a coarse-to-fine pre-training approach for NER that uses automatically mined entity knowledge at three granularities (general-typed spans, coarse-grained types, and fine-grained types) to enhance existing pre-trained models.

Findings

01 Achieves state-of-the-art results on three NER benchmarks.

02 Improves performance in low-resource and few-label scenarios.

03 Demonstrates effectiveness without human-labeled data.

Abstract

More recently, Named Entity Recognition has achieved great advances aided by pre-training approaches such as BERT. However, current pre-training techniques focus on building language modeling objectives to learn a general representation, ignoring the named entity-related knowledge. To this end, we propose a NER-specific pre-training framework to inject coarse-to-fine automatically mined entity knowledge into pre-trained models. Specifically, we first warm up the model via an entity span identification task by training it with Wikipedia anchors, which can be deemed as general-typed entities. Then we leverage the gazetteer-based distant supervision strategy to train the model to extract coarse-grained typed entities. Finally, we devise a self-supervised auxiliary task to mine the fine-grained named entity knowledge via clustering. Empirical studies on three public NER datasets demonstrate that our…
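The gazetteer-based distant supervision step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the official repository for that); it assumes a greedy longest-match lookup and BIO tags, and the function name `distant_label`, the `max_len` parameter, and the toy gazetteer are all hypothetical.

```python
from typing import Dict, List, Tuple

Gazetteer = Dict[Tuple[str, ...], str]  # lowercased token tuple -> coarse type


def distant_label(tokens: List[str], gazetteer: Gazetteer, max_len: int = 5) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup in a gazetteer.

    Illustrative assumption: matching is case-insensitive and the longest
    span starting at each position wins. Unmatched tokens are tagged "O".
    """
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(t.lower() for t in tokens[i:i + n])
            etype = gazetteer.get(span)
            if etype is not None:
                labels[i] = "B-" + etype
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + etype
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Toy gazetteer (hypothetical entries, for illustration only).
toy_gazetteer: Gazetteer = {
    ("barack", "obama"): "PER",
    ("new", "york"): "LOC",
    ("new", "york", "city"): "LOC",
}

tokens = ["Barack", "Obama", "visited", "New", "York", "City"]
print(list(zip(tokens, distant_label(tokens, toy_gazetteer))))
```

Labels produced this way are noisy and incomplete, since entities missing from the gazetteer are tagged "O"; that is the usual trade-off of distant supervision and the reason it serves here as a pre-training signal rather than as gold annotation.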

Code & Models

Repositories

strawberryx/CoFEE (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies

Methods: Linear Layer · WordPiece · Adam · Softmax · Layer Normalization · Dense Connections · Multi-Head Attention · Dropout · Linear Warmup With Linear Decay