Towards Neural No-Resource Language Translation: A Comparative Evaluation of Approaches
Madhavendra Thakur

TL;DR
This paper evaluates approaches for translating no-resource languages with minimal data, showing large language models can outperform traditional methods and aid language preservation.
Contribution
It introduces and compares three workflows for no-resource translation, highlighting the effectiveness of in-context learning with large language models over traditional methods.
Findings
Large language models enable effective no-resource translation.
Chain-of-reasoning prompting excels with larger datasets.
Direct prompting is advantageous with very small datasets.
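The two prompting styles contrasted in these findings can be sketched as prompt builders over a small parallel corpus. This is a minimal illustration, not the paper's actual prompts, which are not reproduced on this page. The function names, the instruction wording, and the placeholder sentence pairs are all assumptions made for the example.

```python
from typing import List, Tuple

# (source sentence, English translation); hypothetical type alias
Pair = Tuple[str, str]


def _format_examples(examples: List[Pair]) -> List[str]:
    """Render the parallel sentences as few-shot demonstrations."""
    lines: List[str] = []
    for src, tgt in examples:
        lines += [f"Source: {src}", f"English: {tgt}", ""]
    return lines


def build_direct_prompt(examples: List[Pair], sentence: str) -> str:
    """Direct prompting: ask for the translation only, with no reasoning."""
    lines = ["Translate the sentence into English. Reply with the translation only.", ""]
    lines += _format_examples(examples)
    lines += [f"Source: {sentence}", "English:"]
    return "\n".join(lines)


def build_reasoning_prompt(examples: List[Pair], sentence: str) -> str:
    """Chain-of-reasoning prompting: ask for step-by-step reasoning first."""
    lines = [
        "Translate the sentence into English.",
        "First reason step by step about the words and structure, using the examples.",
        "Then give the final answer on a line starting with 'English:'.",
        "",
    ]
    lines += _format_examples(examples)
    lines += [f"Source: {sentence}", "Reasoning:"]
    return "\n".join(lines)


# Placeholder pairs only; these are NOT real Owens Valley Paiute sentences.
corpus: List[Pair] = [("<source 1>", "<translation 1>"), ("<source 2>", "<translation 2>")]
direct = build_direct_prompt(corpus, "<new sentence>")
reasoning = build_reasoning_prompt(corpus, "<new sentence>")
```

Either string would then be sent to an LLM of choice. Both builders share the same demonstrations, so the only variable is whether the model is asked to reason before answering.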
Abstract
No-resource languages, those with minimal or no digital representation, pose unique challenges for machine translation (MT). Unlike low-resource languages, which rely on limited but existing corpora, no-resource languages often have fewer than 100 sentences available for training. This work explores the problem of no-resource translation through three distinct workflows: fine-tuning of translation-specific models, in-context learning with large language models (LLMs) using chain-of-reasoning prompting, and direct prompting without reasoning. Using Owens Valley Paiute as a case study, we demonstrate that no-resource translation demands fundamentally different approaches from low-resource scenarios, because traditional MT approaches, including those that work for low-resource languages, fail. Empirical results reveal that, although traditional approaches fail, the…
Taxonomy
Topics: Natural Language Processing Techniques
