Congolese Swahili Machine Translation for Humanitarian Response
Alp Öktem, Eric DeLuca, Rodrigue Bashizi, Eric Paquin, Grace Tang

TL;DR
This paper presents a bidirectional neural machine translation system for Congolese Swahili and French, built with low-resource techniques and assessed through human evaluation for humanitarian applications.
Contribution
The authors developed and publicly released a neural translation system for Congolese Swahili and French, trained with low-resource methods and assessed by human evaluators for humanitarian use.
Findings
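Low-resource techniques (cross-dialect transfer and semi-supervised learning) improved BLEU by up to 2.4 points for SWC-FRA and 3.5 points for FRA-SWC.
In human evaluation of the SWC-FRA direction, translations averaged 6.3 out of 10, and 75% conveyed the main message of the source text.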
Models and datasets are publicly available for further research.
Abstract
In this paper we describe our efforts to make a bidirectional Congolese Swahili (SWC) to French (FRA) neural machine translation system with the motivation of improving humanitarian translation workflows. For training, we created a 25,302-sentence general domain parallel corpus and combined it with publicly available data. Experimenting with low-resource methodologies like cross-dialect transfer and semi-supervised learning, we recorded improvements of up to 2.4 and 3.5 BLEU points in the SWC-FRA and FRA-SWC directions, respectively. We performed human evaluations to assess the usability of our models in a COVID-domain chatbot that operates in the Democratic Republic of Congo (DRC). Direct assessment in the SWC-FRA direction demonstrated an average quality ranking of 6.3 out of 10 with 75% of the target strings conveying the main message of the source text. For the FRA-SWC direction,…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
