TL;DR
This paper introduces new tools and a multilingual neural machine translation model capable of translating from 500 source languages to English, addressing the resource disparity among languages.
Contribution
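The authors present three tools for machine translation research (MTData, NLCodec, and RTG) and use them to build a multilingual model that translates from 500 source languages into English. The model can also serve as a parent model for transfer learning to low-resource languages.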
Findings
Created a multilingual model that translates from 500 source languages into English
Tools are publicly available for research and transfer learning
Model is downloadable and can be used as a service or as a parent model for transfer learning to lower-resource languages
Abstract
While there are more than 7000 languages in the world, most translation research efforts have targeted a few high-resource languages. Commercial translation systems support only one hundred languages or fewer, and do not make these models available for transfer to low-resource languages. In this work, we present useful tools for machine translation research: MTData, NLCodec, and RTG. We demonstrate their usefulness by creating a multilingual neural machine translation model capable of translating from 500 source languages to English. We make this multilingual model readily downloadable and usable as a service, or as a parent model for transfer-learning to even lower-resource languages.
