What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis
Xiaolei Huang, Jonathan May, Nanyun Peng

TL;DR
This paper presents an empirical analysis of neural cross-lingual NER. It examines which factors drive transfer, how performance varies across entity lengths, and whether Wikipedia knowledge can improve NER in low-resource languages.
Contribution
It introduces a simple neural architecture for cross-lingual NER and analyzes key transfer factors, providing insights for future research in low-resource language NER.
Findings
The proposed model achieves performance competitive with the state of the art.
Two transferable factors, sequential order and multilingual embeddings, influence cross-lingual transfer performance.
A case study on Bengali, a non-Latin language, suggests that leveraging knowledge from Wikipedia is a promising direction for further improving NER.
Abstract
Building named entity recognition (NER) models for languages that do not have much training data is a challenging task. While recent work has shown promising results on cross-lingual transfer from high-resource languages to low-resource languages, it is unclear what knowledge is transferred. In this paper, we first propose a simple and efficient neural architecture for cross-lingual NER. Experiments show that our model achieves competitive performance with the state-of-the-art. We further analyze how transfer learning works for cross-lingual NER on two transferable factors: sequential order and multilingual embeddings, and investigate how model performance varies across entity lengths. Finally, we conduct a case study on a non-Latin language, Bengali, which suggests that leveraging knowledge from Wikipedia will be a promising direction to further improve model performance. Our…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
