Information Retrieval in African Languages
Hussein Suleman

TL;DR
This paper reviews the challenges and recent efforts in developing information retrieval tools for African languages, highlighting data scarcity and the need for practical solutions to address socio-economic issues.
Contribution
It provides an overview of ongoing research at the University of Cape Town on IR in African languages, emphasizing dataset limitations and future directions.
Findings
Limited datasets hinder IR development in African languages.
Recent work has begun addressing algorithmic challenges.
Data scarcity remains a major obstacle for practical IR systems.
Abstract
Developing Information Retrieval (IR) tools and techniques in African languages suffers from the dual problems of a lack of algorithms and very small test data collections. This affects the creation of practical IR systems and limits the ability to apply IR to address human and socio-economic problems, which is an urgent need in poor countries. This position paper presents an overview of recent and current work conducted at the University of Cape Town in this area. While many problems have been investigated at an early stage, limited dataset sizes for local African languages still persist as a significant limitation and stumbling block.
Taxonomy
Topics: Speech and dialogue systems · Information Retrieval and Search Behavior · Topic Modeling
