A Focus on Neural Machine Translation for African Languages
Laura Martinus, Jade Z. Abbott

TL;DR
This paper trains neural machine translation models from English into five official South African languages (Afrikaans, isiZulu, Northern Sotho, Setswana, Xitsonga). It targets two obstacles, hard-to-find datasets and hard-to-reproduce prior work, and releases its data, code and results publicly as a starting point for further research.
Contribution
The paper trains neural translation models from English into five South African languages and publishes the data, code, and results so that later work in this low-resource setting can reproduce them and compare against them.
Findings
Neural machine translation shows promise for African languages.
Publicly available data and code facilitate further research.
Models were trained to translate English into five official South African languages: Afrikaans, isiZulu, Northern Sotho, Setswana, and Xitsonga.
Abstract
African languages are numerous, complex and low-resourced. The datasets required for machine translation are difficult to discover, and existing research is hard to reproduce. Minimal attention has been given to machine translation for African languages so there is scant research regarding the problems that arise when using machine translation techniques. To begin addressing these problems, we trained models to translate English to five of the official South African languages (Afrikaans, isiZulu, Northern Sotho, Setswana, Xitsonga), making use of modern neural machine translation techniques. The results obtained show the promise of using neural machine translation techniques for African languages. By providing reproducible publicly-available data, code and results, this research aims to provide a starting point for other researchers in African machine translation to compare to and build…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
