Translating Similar Languages: Role of Mutual Intelligibility in Multilingual Transformers
Ife Adebara, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed

TL;DR
This paper examines how mutual intelligibility, measured by Jaccard similarity, relates to translation quality between similar languages under low-resource conditions using Transformer models, and finds that higher mutual intelligibility correlates with better performance.
Contribution
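It presents bilingual and multilingual Transformer systems for five similar-language pairs in both directions (submitted to the WMT 2020 Similar Languages Translation Shared Task), analyzes the impact of mutual intelligibility, and shows that bilingual models outperform multilingual ones on every pair except Hindi-Marathi.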
Findings
Mutual intelligibility positively correlates with translation performance.
Back-translation, applied to one of the language pairs, improves BLEU by more than 3 points.
Spanish-Catalan achieved the best results of the five language pairs.
Abstract
We investigate different approaches to translate between similar languages under low resource conditions, as part of our contribution to the WMT 2020 Similar Languages Translation Shared Task. We submitted Transformer-based bilingual and multilingual systems for all language pairs, in the two directions. We also leverage back-translation for one of the language pairs, acquiring an improvement of more than 3 BLEU points. We interpret our results in light of the degree of mutual intelligibility (based on Jaccard similarity) between each pair, finding a positive correlation between mutual intelligibility and model performance. Our Spanish-Catalan model has the best performance of all the five language pairs. Except for the case of Hindi-Marathi, our bilingual models achieve better performance than the multilingual models on all pairs.
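The abstract grounds mutual intelligibility in Jaccard similarity, which for two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below shows one way such a score could be computed over the vocabularies of two languages. It is only an illustration: the whitespace tokenization, the lowercasing, the helper names `vocabulary` and `jaccard_similarity`, and the toy sentences are assumptions made here, not details taken from the paper.

```python
def vocabulary(sentences):
    """Collect the set of lowercased, whitespace-separated tokens.

    Assumption: simple whitespace tokenization; the paper's actual
    preprocessing is not described in this summary.
    """
    return {token for sentence in sentences for token in sentence.lower().split()}


def jaccard_similarity(vocab_a, vocab_b):
    """Return |A & B| / |A | B| for two vocabularies (0.0 if both are empty)."""
    a, b = set(vocab_a), set(vocab_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Toy, made-up sentences for illustration only (not data from the paper).
    spanish = ["la casa es grande"]
    catalan = ["la casa és gran"]
    score = jaccard_similarity(vocabulary(spanish), vocabulary(catalan))
    print(f"Jaccard similarity: {score:.3f}")
```

A higher score means the two vocabularies overlap more; under the paper's finding, pairs with higher overlap would be expected to reach higher translation quality.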
