Towards Lingua Franca Named Entity Recognition with BERT
Taesun Moon, Parul Awasthy, Jian Ni, Radu Florian

TL;DR
This paper presents a multilingual BERT-based Named Entity Recognition model trained on multiple languages simultaneously, achieving state-of-the-art results and enabling zero-shot predictions on unseen languages.
Contribution
It introduces a single multilingual NER model trained jointly on many languages, improving accuracy and zero-shot capabilities compared to language-specific models.
Findings
Achieves state-of-the-art results on Dutch, Spanish, Arabic, and Chinese datasets.
Performs competitively with monolingual models across multiple languages.
Demonstrates effective zero-shot NER on unseen languages.
Abstract
Information extraction is an important task in NLP, enabling the automatic extraction of data for relational database filling. Historically, research and data were produced for English text, followed in subsequent years by datasets in Arabic, Chinese (ACE/OntoNotes), Dutch, Spanish, German (CoNLL evaluations), and many others. The natural tendency has been to treat each language as a different dataset and build optimized models for each. In this paper we investigate a single Named Entity Recognition model, based on a multilingual BERT, that is trained jointly on many languages simultaneously, and is able to decode these languages with better accuracy than models trained only on one language. To improve the initial model, we study the use of regularization strategies such as multitask learning and partial gradient updates. In addition to being a single model that can tackle multiple…
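As a rough illustration of what "trained jointly on many languages simultaneously" can look like at the data level, the sketch below pools per-language NER sentences into one shuffled stream so that each batch mixes languages. The function name `joint_batches`, the toy corpora, and the uniform pooling strategy are assumptions made for illustration; the abstract does not specify how the authors mix or sample languages.

```python
import random
from typing import Dict, Iterator, List, Tuple

# A sentence is a list of (token, tag) pairs, e.g. [("Madrid", "B-LOC")].
Sentence = List[Tuple[str, str]]


def joint_batches(
    corpora: Dict[str, List[Sentence]],
    batch_size: int,
    seed: int = 0,
) -> Iterator[List[Tuple[str, Sentence]]]:
    """Pool every language's sentences, shuffle once, and yield mixed batches.

    Hypothetical helper: uniform pooling is an assumption, not the paper's
    documented sampling scheme.
    """
    # Keep the language code next to each sentence so it can be inspected later.
    pooled = [(lang, sent) for lang, sents in corpora.items() for sent in sents]
    random.Random(seed).shuffle(pooled)
    for start in range(0, len(pooled), batch_size):
        yield pooled[start:start + batch_size]


# Toy, made-up data standing in for real per-language NER corpora.
corpora = {
    "es": [[("Madrid", "B-LOC")], [("Ana", "B-PER")]],
    "nl": [[("Amsterdam", "B-LOC")], [("Jan", "B-PER")]],
}
batches = list(joint_batches(corpora, batch_size=2))
```

Each batch would then be fed to a single shared model, in contrast to the one-model-per-language setup the abstract describes as the usual practice.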
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: Linear Layer · Weight Decay · Residual Connection · Adam · Layer Normalization · Softmax · Attention Is All You Need · Dropout · Multi-Head Attention
