Are Mutually Intelligible Languages Easier to Translate?
Avital Friedland, Jonathan Zeltser, Omer Levy

TL;DR
This paper investigates whether mutual intelligibility between languages reduces the data required to train neural machine translation models. On the Romance languages, it finds a strong correlation between human mutual intelligibility scores and the area under a model's learning curve.
Contribution
The paper hypothesizes that the amount of data needed to train a translation model is inversely related to the mutual intelligibility of the language pair, and it provides supporting empirical evidence from the Romance languages.
Findings
Strong correlation between mutual intelligibility scores and the area under a translation model's learning curve
Higher mutual intelligibility is associated with lower data requirements for training translation models
Empirical validation on the Romance language group
Abstract
Two languages are considered mutually intelligible if their native speakers can communicate with each other while each uses their own mother tongue. How does the fact that humans perceive a language pair as mutually intelligible affect the ability to learn a translation model between them? We hypothesize that the amount of data needed to train a neural machine translation model is inversely proportional to the languages' mutual intelligibility. Experiments on the Romance language group reveal that there is indeed a strong correlation between the area under a model's learning curve and mutual intelligibility scores obtained by studying human speakers.
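The comparison described in the abstract can be illustrated with a minimal sketch. It assumes a learning curve is a list of (training-set size, quality score) points, approximates the area under it with the trapezoidal rule, and correlates the areas with intelligibility scores using Pearson's r. The function names, the pair labels, and every number below are made up for illustration; they are not data from the paper, and the paper's exact area computation and correlation measure may differ.

```python
from math import sqrt


def area_under_curve(sizes, scores):
    """Trapezoidal area under a learning curve (score vs. training-set size)."""
    area = 0.0
    for i in range(1, len(sizes)):
        width = sizes[i] - sizes[i - 1]
        area += width * (scores[i] + scores[i - 1]) / 2.0
    return area


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: training-set sizes shared by all language pairs.
sizes = [1, 2, 4, 8]

# Hypothetical learning curves (quality score at each size) per language pair.
curves = {
    "pair_a": [5.0, 9.0, 14.0, 18.0],
    "pair_b": [8.0, 13.0, 18.0, 21.0],
    "pair_c": [12.0, 17.0, 21.0, 23.0],
}

# Hypothetical human mutual intelligibility scores for the same pairs.
intelligibility = {"pair_a": 0.2, "pair_b": 0.5, "pair_c": 0.8}

pairs = sorted(curves)
areas = [area_under_curve(sizes, curves[p]) for p in pairs]
scores = [intelligibility[p] for p in pairs]

print("areas:", areas)
print("correlation:", pearson(areas, scores))
```

A larger area means the model reaches good quality with less data, so under the paper's hypothesis the areas should rise with the intelligibility scores. A rank-based measure such as Spearman's correlation would be a reasonable alternative if the relationship is monotonic but not linear.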
Taxonomy
Topics: Natural Language Processing Techniques
