Application of Pre-training Models in Named Entity Recognition
Yu Wang, Yining Sun, Zuchang Ma, Lisheng Gao, Yang Xu, Ting Sun

TL;DR
This paper reviews four pre-training models (BERT, ERNIE, ERNIE2.0-tiny, and RoBERTa) and evaluates their effectiveness in Named Entity Recognition, with RoBERTa performing best on the MSRA-2006 dataset.
Contribution
It introduces the architectures and pre-training tasks of four models and compares their NER performance through fine-tuning experiments.
Findings
RoBERTa achieved state-of-the-art results on the MSRA-2006 dataset.
Pre-training models significantly improve NER performance.
Different model architectures and pre-training tasks impact NER effectiveness.
Abstract
Named Entity Recognition (NER) is a fundamental Natural Language Processing (NLP) task that extracts entities from unstructured data. Earlier NER methods relied on machine learning or deep learning. Recently, pre-training models have significantly improved performance on multiple NLP tasks. In this paper, we first introduce the architectures and pre-training tasks of four common pre-training models: BERT, ERNIE, ERNIE2.0-tiny, and RoBERTa. We then apply these pre-training models to an NER task by fine-tuning, and compare the effects of the different model architectures and pre-training tasks on that task. The experimental results show that RoBERTa achieved state-of-the-art results on the MSRA-2006 dataset.
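To make the fine-tuning setup in the abstract more concrete, the following is a minimal sketch of a token-classification head of the kind commonly placed on top of a pre-trained encoder: a linear layer followed by softmax for each token, then grouping of the predicted tags into entity spans. It is an illustration under stated assumptions, not the authors' implementation. The BIO label set, the names `classify_tokens` and `decode_entities`, and the toy weights are all hypothetical, and the encoder itself (BERT, ERNIE, RoBERTa) is replaced by hand-written hidden states.

```python
import math

# Hypothetical BIO label set; the tag set used in the paper is not stated on this page.
LABELS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-ORG", "I-ORG"]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_tokens(hidden_states, weight, bias):
    """Apply a linear layer plus softmax to each token vector.

    hidden_states: one vector per token (stand-in for encoder output).
    weight: one row per label, each row as long as a token vector.
    bias: one value per label.
    Returns the most probable label for every token.
    """
    tags = []
    for h in hidden_states:
        logits = [
            sum(w_i * h_i for w_i, h_i in zip(row, h)) + b
            for row, b in zip(weight, bias)
        ]
        probs = softmax(logits)
        tags.append(LABELS[probs.index(max(probs))])
    return tags


def decode_entities(tags):
    """Group BIO tags into (entity_type, start, end_exclusive) spans."""
    spans = []
    start, ent_type = None, None
    for i, tag in enumerate(tags):
        # An I- tag continues the open entity only if the types match.
        if tag.startswith("I-") and ent_type == tag[2:]:
            continue
        # Anything else closes the open entity, if there is one.
        if ent_type is not None:
            spans.append((ent_type, start, i))
            start, ent_type = None, None
        if tag.startswith("B-"):
            start, ent_type = i, tag[2:]
    if ent_type is not None:
        spans.append((ent_type, start, len(tags)))
    return spans
```

In an actual fine-tuning run the weights of both the encoder and this head would be learned from labeled data (for example with Adam and dropout, as listed under Methods); here they are fixed by hand only to show the data flow from token vectors to tags to entity spans.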
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: ERNIE · Linear Layer · Adam · Softmax · Layer Normalization · Dropout · Attention Is All You Need · Multi-Head Attention · Residual Connection · Attention Dropout
