Rule-Based, Neural and LLM Back-Translation: Comparative Insights from a Variant of Ladin
Samuel Frontull, Georg Moser

TL;DR
This study compares rule-based, neural, and large language model (LLM) back-translation for Ladin, a low-resource language. All three approaches yield comparable translation quality, but round-trip translation exposes differences between the models.
Contribution
It provides a comparative analysis of back-translation techniques for Ladin, including rule-based, neural, and LLM methods, in a low-resource setting.
Findings
All methods achieve comparable translation quality.
Round-trip translations reveal performance differences.
Neural and LLM approaches can complement rule-based systems.
Abstract
This paper explores the impact of different back-translation approaches on machine translation for Ladin, specifically the Val Badia variant. Given the limited amount of parallel data available for this language (only 18k Ladin-Italian sentence pairs), we investigate the performance of a multilingual neural machine translation model fine-tuned for Ladin-Italian. In addition to the available authentic data, we synthesise further translations by using three different models: a fine-tuned neural model, a rule-based system developed specifically for this language pair, and a large language model. Our experiments show that all approaches achieve comparable translation quality in this low-resource scenario, yet round-trip translations highlight differences in model performance.
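As a rough illustration of the two ideas in the abstract, the sketch below pairs authentic target-side sentences with synthetic source sentences (back-translation) and sends a sentence through both translation directions (round-trip translation). It is a minimal sketch, not the authors' pipeline: the function names, the `Translator` type, and the token-overlap score are assumptions made for illustration, and the toy translators (`str.upper`, `str.lower`) merely stand in for a rule-based system, a fine-tuned neural model, or an LLM.

```python
from typing import Callable, List, Tuple

# A translator is any function mapping a sentence to a sentence.
# Hypothetical type alias; a rule-based, neural, or LLM system could be wrapped this way.
Translator = Callable[[str], str]


def back_translate(
    monolingual_target: List[str], target_to_source: Translator
) -> List[Tuple[str, str]]:
    """Build synthetic (source, target) pairs from authentic target sentences."""
    return [(target_to_source(t), t) for t in monolingual_target]


def round_trip(sentence: str, forward: Translator, backward: Translator) -> str:
    """Translate a sentence into the other language and back again."""
    return backward(forward(sentence))


def token_overlap(reference: str, hypothesis: str) -> float:
    """Dice overlap of token sets; a crude stand-in for a real MT metric."""
    ref = set(reference.lower().split())
    hyp = set(hypothesis.lower().split())
    if not ref or not hyp:
        return 0.0
    return 2 * len(ref & hyp) / (len(ref) + len(hyp))


if __name__ == "__main__":
    # Toy translators (assumption): upper-casing and lower-casing play the
    # roles of the two translation directions.
    to_source: Translator = str.upper
    to_target: Translator = str.lower

    synthetic_pairs = back_translate(["ciao mondo"], to_source)
    print(synthetic_pairs)

    restored = round_trip("ciao mondo", to_source, to_target)
    print(restored, token_overlap("ciao mondo", restored))
```

In this framing, the synthetic pairs would be mixed with the authentic parallel data before fine-tuning, and the round-trip score gives a per-model signal that does not require reference translations.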
Taxonomy
Topics: Natural Language Processing Techniques
