HausaNLP: Current Status, Challenges and Future Directions for Hausa Natural Language Processing
Shamsuddeen Hassan Muhammad, Ibrahim Said Ahmad, Idris Abdulmumin, Falalu Ibrahim Lawan, Babangida Sani, Sukairaj Hafiz Imam, Yusuf Aliyu, Sani Abdullahi Sani, Ali Usman Umar, Tajuddeen Gwadabe, Kenneth Church, Vukosi Marivate

TL;DR
This paper reviews the current state, challenges, and future directions of Hausa NLP, highlighting resource limitations and proposing research avenues to advance language processing for Hausa, a low-resource language with over 120 million first-language speakers.
Contribution
It provides a comprehensive overview of Hausa NLP, introduces a curated resource catalog, and discusses strategies to overcome challenges in model development and community engagement.
Findings
Hausa NLP suffers from limited open-source datasets and inadequate model representation.
A curated catalog (HausaNLP) aggregates datasets, tools, and research works to facilitate research.
The paper identifies key challenges in integrating Hausa into large language models.
Abstract
Hausa Natural Language Processing (NLP) has gained increasing attention in recent years, yet remains understudied as a low-resource language despite having over 120 million first-language (L1) and 80 million second-language (L2) speakers worldwide. While significant advances have been made in high-resource languages, Hausa NLP faces persistent challenges, including limited open-source datasets and inadequate model representation. This paper presents an overview of the current state of Hausa NLP, systematically examining existing resources, research contributions, and gaps across fundamental NLP tasks: text classification, machine translation, named entity recognition, speech recognition, and question answering. We introduce HausaNLP (https://catalog.hausanlp.org), a curated catalog that aggregates datasets, tools, and research works to enhance accessibility and drive further…
