Ngambay-French Neural Machine Translation (sba-Fr)

Sakayo Toadoum Sari; Angela Fan; Lema Logamou Seknewna

arXiv:2308.13497 · cs.CL · August 28, 2023

TL;DR

This paper introduces the first Ngambay-French translation dataset and fine-tunes pre-trained models on it for low-resource NMT; among the models tested, M2M100 achieves the best BLEU scores.

Contribution

Created the first Ngambay-French translation dataset and fine-tuned pre-trained models for low-resource NMT in Chad.

Findings

01. M2M100 outperforms other models in BLEU scores

02. The dataset enables further research in Ngambay language translation

03. Synthetic data improves translation performance
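
The findings above compare models by BLEU score. As a rough illustration of what that metric measures, here is a minimal sketch of a simplified, single-reference, sentence-level BLEU: clipped n-gram precision for n = 1 to 4, combined by geometric mean and multiplied by a brevity penalty. It assumes whitespace tokenization and no smoothing, and the function names `ngrams` and `sentence_bleu` are hypothetical. The evaluation tooling the authors actually used is not stated in this listing, and corpus-level scores from standard tools will differ from this sketch.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis, reference, max_n=4):
    """Simplified single-reference BLEU on whitespace tokens, scaled to 0-100.

    Assumption: no smoothing, so the score is 0 whenever any n-gram
    order (1..max_n) has no match in the reference.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clipped matches: a hypothesis n-gram is credited at most as many
        # times as it occurs in the reference.
        matches = sum(min(count, ref_counts[gram])
                      for gram, count in hyp_counts.items())
        if total == 0 or matches == 0:
            return 0.0
        log_precisions.append(math.log(matches / total))

    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hyp) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))

    # Geometric mean of the n-gram precisions, times the brevity penalty.
    return 100 * brevity_penalty * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    # Illustrative French sentences, not taken from the sba-Fr dataset.
    reference = "le chat est sur la table"
    print(sentence_bleu("le chat est sur la table", reference))
    print(sentence_bleu("le chat est sur le tapis", reference))
```

An exact match scores 100, a hypothesis sharing no words with the reference scores 0, and partial overlaps fall in between. Reported results for NMT systems are normally corpus-level BLEU, which pools n-gram counts over the whole test set before taking the geometric mean.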

Abstract

In Africa, and in the world at large, there is an increasing focus on developing Neural Machine Translation (NMT) systems to overcome language barriers. NMT for low-resource languages is particularly compelling because it involves learning from limited labelled data. However, obtaining a well-aligned parallel corpus for low-resource languages can be challenging. The disparity between the technological advancement of a few global languages and the lack of research on NMT for local languages in Chad is striking. End-to-end NMT trials on low-resource Chadian languages have not been attempted. Additionally, unlike for some other African languages, there is a dearth of well-structured online data collection for research in Natural Language Processing. However, a guided approach to data gathering can produce bitext data pairing many Chadian languages with well-known languages that have ample data.…

Code & Models

Repositories

Toadoum/Ngambay-French-Neural-Machine-Translation-sba_fr_v1-
none · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
