Exploring Linguistic Similarity and Zero-Shot Learning for Multilingual Translation of Dravidian Languages
Danish Ebadulla, Rahul Raman, S. Natarajan, Hridhay Kiran Shetty, Ashish Harish Shenoy

TL;DR
This paper presents a novel single encoder-decoder neural machine translation approach for Dravidian languages that leverages linguistic similarity and transliteration to improve zero-shot translation, reducing reliance on extensive data and compute resources.
Contribution
The authors introduce a single encoder-decoder model that uses linguistic similarity and transliteration for zero-shot translation among Dravidian languages, coming within 3 BLEU of pivot-based methods while training on half the language directions.
Findings
Comes within 3 BLEU of pivot-based models when trained on half the language directions
Effectively uses transliteration and linguistic similarity to enhance zero-shot translation
Reduces data and compute requirements compared to existing methods
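To make the transliteration idea concrete, here is a minimal sketch of mapping text in several Dravidian scripts onto one shared script, so that a single encoder sees overlapping character sequences for related words. This is not the authors' pipeline. The function name to_common_script, the choice of Kannada as the shared script, and the offset-based mapping are all assumptions made for illustration. The sketch relies on the fact that the Unicode blocks for Tamil, Telugu, Kannada and Malayalam follow a parallel layout, so a fixed code-point offset gives a rough character-level conversion. Some characters have no counterpart in the target block, so the mapping is lossy.

```python
# Illustrative only: a rough script-to-script mapping via Unicode block offsets.
# Assumption: Kannada as the shared script is arbitrary, not taken from the paper.

# Start of each script's 128-code-point Unicode block.
BLOCK_START = {
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80


def to_common_script(text, source, target="kannada"):
    """Shift every character of the source script into the target script's block.

    Characters outside the source block (spaces, digits, Latin text) pass through
    unchanged. Code points without an assigned counterpart in the target block
    are still shifted, so the result may contain unassigned code points.
    """
    src = BLOCK_START[source]
    dst = BLOCK_START[target]
    out = []
    for ch in text:
        cp = ord(ch)
        if src <= cp < src + BLOCK_SIZE:
            out.append(chr(cp - src + dst))
        else:
            out.append(ch)
    return "".join(out)


# The letter KA in Telugu (U+0C15) and Malayalam (U+0D15) maps to the same
# Kannada code point (U+0C95).
telugu_ka = to_common_script("\u0c15", "telugu")
malayalam_ka = to_common_script("\u0d15", "malayalam")
```

After such a mapping, cognates written in different scripts can share subword units, which is one plausible way linguistic similarity could help zero-shot directions. A production system would more likely use a dedicated transliteration library with script-specific rules.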
Abstract
Current research in zero-shot translation is plagued by several issues such as high compute requirements, increased training time and off-target translations. Proposed remedies often come at the cost of additional data or compute requirements. Pivot-based neural machine translation is preferred over a single-encoder model for most settings despite the increased training and evaluation time. In this work, we overcome the shortcomings of zero-shot translation by taking advantage of transliteration and linguistic similarity. We build a single encoder-decoder neural machine translation system for Dravidian-Dravidian multilingual translation and perform zero-shot translation. We compare the data vs zero-shot accuracy tradeoff and evaluate the performance of our vanilla method against the current state-of-the-art pivot-based method. We also test the theory that morphologically rich languages…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
