What Matters When Building Universal Multilingual Named Entity Recognition Models?
Jonas Golde, Patrick Haller, Alan Akbik

TL;DR
This paper systematically investigates design choices in multilingual NER models, introduces Otter, a new universal model supporting over 100 languages, and demonstrates its superior performance and efficiency.
Contribution
The paper provides a comprehensive analysis of architectural and training decisions in multilingual NER and introduces Otter, a state-of-the-art, efficient universal multilingual NER model.
Findings
Otter outperforms strong baselines by 5.3 F1 points.
Otter achieves results competitive with large generative models.
Systematic evaluation clarifies the impact of design choices.
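For readers unfamiliar with how gains such as "5.3 F1 points" are usually measured in NER, the following is a minimal sketch of span-level micro-averaged F1. It assumes exact-match scoring over (start, end, label) spans, which is a common convention but is not confirmed by this summary as the paper's exact protocol. The function name `micro_f1` and the toy data are hypothetical.

```python
from typing import List, Set, Tuple

# A span is (start_token, end_token, label). Exact match is assumed here.
Span = Tuple[int, int, str]


def micro_f1(gold: List[Set[Span]], pred: List[Set[Span]]) -> float:
    """Micro-averaged F1 over all sentences (hypothetical helper)."""
    # A prediction counts as a true positive only if boundaries and label match.
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: two sentences, one mislabeled span in the first.
gold = [{(0, 2, "PER"), (5, 6, "LOC")}, {(1, 3, "ORG")}]
pred = [{(0, 2, "PER"), (5, 6, "ORG")}, {(1, 3, "ORG")}]
score = micro_f1(gold, pred)
print(f"F1 = {100 * score:.1f} points")
```

F1 is typically reported on a 0 to 100 scale, so under this convention a 5.3-point improvement would correspond to, for example, moving from 60.0 to 65.3.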
Abstract
Recent progress in universal multilingual named entity recognition (NER) has been driven by advances in multilingual transformer models and task-specific architectures, loss functions, and training datasets. Despite substantial prior work, we find that many critical design decisions for such models are made without systematic justification, with architectural components, training objectives, and data sources evaluated only in combination rather than in isolation. We argue that these decisions impede progress in the field by making it difficult to identify which choices improve model performance. In this work, we conduct extensive experiments around architectures, transformer backbones, training objectives, and data composition across a wide range of languages. Based on these insights, we introduce Otter, a universal multilingual NER model supporting over 100 languages. Otter achieves…
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Natural Language Processing Techniques
