UNER: Universal Named-Entity Recognition Framework
Diego Alves, Tin Kuculo, Gabriel Amaral, Gaurish Thakkar, and Marko Tadic

TL;DR
The paper presents UNER, a multilingual framework for named-entity recognition, including a new corpus and methodology for annotation and evaluation across multiple languages, aiming to enhance NER tools and expand language coverage.
Contribution
Introduces UNER, a 4-level classification hierarchy for named entities, together with a methodology for building a multilingual NER corpus through automatic annotation propagation.
Findings
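The first multilingual UNER corpus is being created from the SETimes parallel corpus.
The English annotations are to be evaluated through crowdsourcing campaigns before propagation to other languages.
The UNER multilingual dataset will be used to train and test available NER tools as an extrinsic evaluation.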
Abstract
We introduce the Universal Named-Entity Recognition (UNER) framework, a 4-level classification hierarchy, and the methodology that is being adopted to create the first multilingual UNER corpus: the SETimes parallel corpus annotated for named-entities. First, the English SETimes corpus will be annotated using existing tools and knowledge bases. After evaluating the resulting annotations through crowdsourcing campaigns, they will be propagated automatically to other languages within the SETimes corpora. Finally, as an extrinsic evaluation, the UNER multilingual dataset will be used to train and test available NER tools. As part of future research directions, we aim to increase the number of languages in the UNER corpus and to investigate possible ways of integrating UNER with available knowledge graphs to improve named-entity recognition.
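The abstract does not say how annotations are propagated across the parallel corpus. One common approach, shown here purely as an illustrative sketch and not as the authors' method, is to project entity labels from English tokens onto target-language tokens through word alignments. The function name `project_labels`, the flat labels `PER` and `LOC`, the example sentence pair, and the hand-written alignment are all assumptions made for this example; they are not taken from the UNER hierarchy or the SETimes data.

```python
from typing import List, Tuple


def project_labels(
    src_labels: List[str],
    alignment: List[Tuple[int, int]],
    tgt_len: int,
) -> List[str]:
    """Copy each source token's entity label to the target tokens aligned with it.

    src_labels: one label per source token, "O" meaning "not an entity".
    alignment:  (source_index, target_index) pairs, assumed to be given
                (for example by an external word aligner).
    tgt_len:    number of tokens in the target sentence.
    """
    tgt_labels = ["O"] * tgt_len
    for src_idx, tgt_idx in alignment:
        label = src_labels[src_idx]
        # Keep the first entity label assigned to a target token.
        if label != "O" and tgt_labels[tgt_idx] == "O":
            tgt_labels[tgt_idx] = label
    return tgt_labels


# Hypothetical English/Croatian sentence pair with a hand-written alignment.
english = ["Marko", "visited", "Zagreb", "yesterday"]
croatian = ["Marko", "je", "jučer", "posjetio", "Zagreb"]
english_labels = ["PER", "O", "LOC", "O"]
word_alignment = [(0, 0), (1, 3), (2, 4), (3, 2)]

projected = project_labels(english_labels, word_alignment, len(croatian))
for token, label in zip(croatian, projected):
    print(token, label)
```

In this sketch, unaligned target tokens stay unlabeled, so the quality of the projected annotations depends directly on the quality of the alignment.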
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
