Rule-Based, Neural and LLM Back-Translation: Comparative Insights from a Variant of Ladin

Samuel Frontull; Georg Moser

arXiv:2407.08819 · cs.CL · July 15, 2024


TL;DR

This study compares rule-based, neural, and large language model back-translation methods for Ladin, a low-resource language. All three approaches reach similar translation quality, while round-trip translation assessments expose differences between them.

Contribution

The paper provides a comparative analysis of rule-based, neural, and LLM back-translation techniques for Ladin in a low-resource setting.

Findings

01. All methods achieve comparable translation quality.

02. Round-trip translations reveal performance differences.

03. Neural and LLM approaches can complement rule-based systems.

Abstract

This paper explores the impact of different back-translation approaches on machine translation for Ladin, specifically the Val Badia variant. Given the limited amount of parallel data available for this language (only 18k Ladin-Italian sentence pairs), we investigate the performance of a multilingual neural machine translation model fine-tuned for Ladin-Italian. In addition to the available authentic data, we synthesise further translations by using three different models: a fine-tuned neural model, a rule-based system developed specifically for this language pair, and a large language model. Our experiments show that all approaches achieve comparable translation quality in this low-resource scenario, yet round-trip translations highlight differences in model performance.
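
The two ideas the abstract relies on, back-translation and round-trip translation, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the translator functions, the `lld:` token prefix, and the `unigram_f1` score are hypothetical stand-ins chosen so the example runs without any model, and the translation direction used for synthesis is an assumption. In practice each `Translator` would wrap a fine-tuned neural model, a rule-based system, or an LLM, and a standard metric would replace the toy score.

```python
from collections import Counter
from typing import Callable, List, Tuple

# A translator is any function from a sentence to a sentence.
# In a real setup this could wrap a neural model, a rule-based system, or an LLM.
Translator = Callable[[str], str]


def toy_it_to_lld(sentence: str) -> str:
    """Hypothetical stand-in translator: tags every token with 'lld:'."""
    return " ".join("lld:" + tok for tok in sentence.split())


def toy_lld_to_it(sentence: str) -> str:
    """Hypothetical inverse of toy_it_to_lld: strips the 'lld:' tag."""
    return " ".join(
        tok[4:] if tok.startswith("lld:") else tok for tok in sentence.split()
    )


def back_translate(
    monolingual: List[str], translate: Translator
) -> List[Tuple[str, str]]:
    """Pair each authentic sentence with its synthetic translation.

    Returns (synthetic, authentic) pairs that can be added to the
    authentic parallel data before fine-tuning.
    """
    return [(translate(s), s) for s in monolingual]


def round_trip(
    sentences: List[str], forward: Translator, backward: Translator
) -> List[str]:
    """Translate into the other language and back again."""
    return [backward(forward(s)) for s in sentences]


def unigram_f1(reference: str, hypothesis: str) -> float:
    """Crude token-overlap score, used here only as a placeholder metric."""
    ref, hyp = Counter(reference.split()), Counter(hypothesis.split())
    overlap = sum((ref & hyp).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    monolingual_it = ["la casa", "il libro"]
    synthetic_pairs = back_translate(monolingual_it, toy_it_to_lld)
    print(synthetic_pairs)

    # Round-trip check: compare each sentence with its reconstruction.
    for original, restored in zip(
        monolingual_it, round_trip(monolingual_it, toy_it_to_lld, toy_lld_to_it)
    ):
        print(original, "->", restored, unigram_f1(original, restored))
```

Keeping the translator behind a plain function type makes it straightforward to swap the three kinds of system in and out while the augmentation and round-trip logic stays unchanged.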


Taxonomy

Topics: Natural Language Processing Techniques