Harnessing Transfer Learning from Swahili: Advancing Solutions for Comorian Dialects
Naira Abdou Mohamed, Zakarya Erraji, Abdessalam Bahafid, Imade Benelallam

TL;DR
This paper explores transfer learning from Swahili to develop NLP tools for Comorian dialects, demonstrating promising results in speech recognition and translation despite limited data.
Contribution
It pioneers NLP tools for the Comorian dialects by exploiting their lexical similarity to Swahili, transferring knowledge from a better-resourced language to a low-resource one.
Findings
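Achieved ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.6826, 0.42, and 0.6532 in machine translation (MT)
Recorded a word error rate (WER) of 39.50% and a character error rate (CER) of 13.76% in automatic speech recognition (ASR)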
Demonstrated feasibility of transfer learning for underrepresented African languages
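For readers unfamiliar with the ASR metrics above, WER and CER are commonly defined as the edit distance between a reference transcript and the system output, divided by the reference length in words or characters. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function names and the Swahili example sentence are illustrative assumptions and do not come from the paper's data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of substitutions, insertions, and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Illustrative example (hypothetical transcript pair, not from the paper):
reference = "habari ya asubuhi"
hypothesis = "habari za asubuhi"
print(f"WER = {wer(reference, hypothesis):.4f}")
print(f"CER = {cer(reference, hypothesis):.4f}")
```

A single wrong character changes one word out of three but only one character out of seventeen, so CER comes out far lower than WER. That is consistent with the gap between the reported 39.50% WER and 13.76% CER.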
Abstract
If today some African languages like Swahili have enough resources to develop high-performing Natural Language Processing (NLP) systems, many other languages spoken on the continent are still lacking such support. For these languages, still in their infancy, several possibilities exist to address this critical lack of data. Among them is Transfer Learning, which allows low-resource languages to benefit from the good representation of other languages that are similar to them. In this work, we adopt a similar approach, aiming to pioneer NLP technologies for Comorian, a group of four languages or dialects belonging to the Bantu family. Our approach is initially motivated by the hypothesis that if a human can understand a different language from their native language with little or no effort, it would be entirely possible to model this process on a machine. To achieve this, we consider…
Taxonomy
Topics: African history and culture analysis · Language, Linguistics, Cultural Analysis · Global Maritime and Colonial Histories
Methods: ADaptive gradient method with the OPTimal convergence rate · Focus
