Dict-NMT: Bilingual Dictionary based NMT for Extremely Low Resource Languages
Nalin Kumar, Deepak Kumar, Subhankar Mishra

TL;DR
This paper introduces an NMT approach that uses bilingual dictionaries to improve translation quality for extremely low-resource languages. It reports gains over baseline models and, in its multilingual extension, zero-shot translation.
Contribution
The paper proposes a bilingual dictionary-based NMT method tailored for extremely low-resource languages, including multilingual extensions with zero-shot translation.
Findings
Dictionary quality impacts translation performance
Method outperforms baseline models on low-resource languages
Multilingual extension enables zero-shot translation
Abstract
Neural Machine Translation (NMT) models have been effective on large bilingual datasets. However, the existing methods and techniques show that the model's performance is highly dependent on the number of examples in training data. For many languages, having such an amount of corpora is a far-fetched dream. Taking inspiration from monolingual speakers exploring new languages using bilingual dictionaries, we investigate the applicability of bilingual dictionaries for languages with extremely low, or no bilingual corpus. In this paper, we explore methods using bilingual dictionaries with an NMT model to improve translations for extremely low resource languages. We extend this work to multilingual systems, exhibiting zero-shot properties. We present a detailed analysis of the effects of the quality of dictionaries, training dataset size, language family, etc., on the translation quality.…
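The abstract does not say how the dictionary is combined with the NMT model, so the following is only a rough illustration of the monolingual-speaker analogy it draws, not the authors' method. It is a minimal sketch that glosses a sentence word by word with a bilingual dictionary; the function name `gloss_with_dictionary` and the toy English-to-Spanish entries are assumptions made for this example.

```python
from typing import Dict, List


def gloss_with_dictionary(sentence: str, bilingual_dict: Dict[str, str]) -> List[str]:
    """Replace each source token with its dictionary translation when one exists.

    Hypothetical helper for illustration; the paper's actual pipeline may differ.
    """
    glossed = []
    for token in sentence.lower().split():
        # Tokens missing from the dictionary are copied through unchanged.
        glossed.append(bilingual_dict.get(token, token))
    return glossed


# Toy English-to-Spanish entries, chosen only for illustration.
toy_dict = {"the": "el", "cat": "gato", "sleeps": "duerme"}

print(gloss_with_dictionary("The cat sleeps soundly", toy_dict))
```

A gloss like this ignores word order, morphology, and words with several senses, which is one plausible reason dictionary quality would affect translation quality, as the Findings note.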
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Test
