Development of a WAZOBIA-Named Entity Recognition System
S. E. Emedem, I. E. Onyenwe, E. G. Onyedinma

TL;DR
This paper introduces WAZOBIA-NER, a system for recognizing named entities in Nigerian languages, utilizing annotated datasets, advanced machine learning models, and OCR technology to overcome resource limitations.
Contribution
The study develops the first comprehensive NER system for Hausa, Yoruba, and Igbo, combining OCR, annotated datasets, and modern NLP models to improve entity recognition in under-resourced languages.
Findings
Achieved high precision and recall in NER tasks across three Nigerian languages.
Demonstrated the feasibility of using OCR and transfer learning for under-resourced languages.
Evaluated multiple models, with the best achieving over 95% F1-score.
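The precision, recall, and F1 figures above are the usual NER metrics. As a minimal sketch of how such scores can be computed, the snippet below scores predictions at the entity level over BIO-tagged sequences. It assumes exact-match span scoring and BIO tagging; the summary does not say whether the paper scores tokens or entities, and the tag names (`PER`, `LOC`) and function names are illustrative rather than taken from the paper.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive) -- hypothetical representation
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence.

    An "I-" tag with no matching open span is ignored (an assumption;
    some scorers treat it as the start of a new span instead).
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        # The open span continues only on an "I-" tag of the same type.
        inside = tag.startswith("I-") and tag[2:] == label
        if label is not None and not inside:
            spans.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def prf1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 under exact span match."""
    g, p = extract_spans(gold), extract_spans(pred)
    tp = len(g & p)  # spans with identical type and boundaries
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: the prediction finds the person but misses the location.
gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
precision, recall, f1 = prf1(gold, pred)
print(precision, recall, round(f1, 4))
```

In the toy example the single predicted span is correct, so precision is 1.0, while only one of the two gold spans is recovered, so recall is 0.5.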
Abstract
Named Entity Recognition (NER) is crucial for many natural language processing applications, including information extraction, machine translation, and sentiment analysis. Despite growing interest in African languages within computational linguistics, existing NER systems focus mainly on English, European, and a few other global languages, leaving a significant gap for under-resourced languages. This research presents the development of a WAZOBIA-NER system tailored to the three most prominent Nigerian languages: Hausa, Yoruba, and Igbo. It begins with a comprehensive compilation of annotated datasets for each language, addressing the challenges of data scarcity and linguistic diversity. Exploring state-of-the-art machine learning techniques, Conditional Random Fields (CRF) and deep learning models such as Bidirectional Long Short-Term Memory (BiLSTM),…
