Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models

Atnafu Lambebo Tonja; Hellina Hailu Nigatu; Olga Kolesnikova; Grigori Sidorov; Alexander Gelbukh; Jugal Kalita

arXiv:2305.17406·cs.CL·May 30, 2023·1 cites

Open Access

TL;DR

This paper explores multilingual and bilingual models for machine translation of indigenous languages of the Americas and reports that the mBART setup improved upon the baseline for three of the eleven languages tested.

Contribution

The paper describes transfer learning setups built on two multilingual models, M2M-100 and mBART50, and one bilingual Helsinki NLP Spanish-English model, and reports experimental results across eleven indigenous languages.

Findings

01. The mBART setup improved upon the baseline for 3 of 11 languages

02. Transfer learning setups varied in effectiveness

03. Multilingual models show potential for indigenous language translation

Abstract

This paper describes CIC NLP's submission to the AmericasNLP 2023 Shared Task on machine translation systems for indigenous languages of the Americas. We present the system descriptions for three methods. We used two multilingual models, namely M2M-100 and mBART50, and one bilingual (one-to-one) model, the Helsinki NLP Spanish-English translation model, and experimented with different transfer learning setups. We experimented with 11 languages from the Americas and report the setups we used as well as the results we achieved. Overall, the mBART setup was able to improve upon the baseline for three out of the eleven languages.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: mBART