Bootstrapping NLP tools across low-resourced African languages: an overview and prospects
C. Maria Keet

TL;DR
This paper reviews methods for developing NLP tools for low-resource African languages, highlighting successes in rule-based bootstrapping and challenges in data-driven approaches due to lexical diversity.
Contribution
It provides an overview of bootstrapping strategies for Niger-Congo B languages, emphasizing the potential of linguistics-informed measures to improve NLP tool development.
Findings
Bootstrapping grammars yields positive results for morphology and grammar rules.
Data-driven NLP approaches face challenges due to lexical diversity.
Cladistic and similarity measures may enhance bootstrapping strategies.
Abstract
Computing and Internet access are substantially growing markets in Southern Africa, which brings with it increasing demands for local content and tools in indigenous African languages. Since most of those languages are low-resourced, efforts have gone into the notion of bootstrapping tools for one African language from another. This paper provides an overview of these efforts for Niger-Congo B ('Bantu') languages. Bootstrapping grammars for geographically distant languages has been shown to still have positive outcomes for morphology and for rule-based or grammar-based natural language generation. Bootstrapping with data-driven approaches to NLP tasks is difficult to use meaningfully regardless of geographic proximity, largely because of lexical diversity arising from both orthography and vocabulary. Cladistic approaches in comparative linguistics may inform bootstrapping strategies and similarity…
Taxonomy
Topics: Natural Language Processing Techniques · linguistics and terminology studies · Language and cultural evolution
