Compensating for Data with Reasoning: Low-Resource Machine Translation with LLMs
Samuel Frontull, Thomas Ströhle

TL;DR
This paper introduces two prompting methods for low-resource machine translation with LLMs and shows that translation quality improves with the syntactic coverage of the retrieved examples and with the reasoning ability of the model, especially when parallel data is scarce.
Contribution
The paper proposes Fragment-Shot Prompting, an in-context learning method that segments the input and retrieves translation examples based on syntactic coverage, and Pivoted Fragment-Shot, an extension that enables translation without direct parallel data.
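The segment-and-retrieve idea can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenization, the word n-gram fragments, the greedy longest-first selection, and the names `select_examples` and `build_prompt` are all assumptions made for illustration, and the target strings are placeholders rather than real translations.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def contains(fragment: str, sentence: str) -> bool:
    """True if the fragment occurs in the sentence on word boundaries."""
    return f" {fragment} " in f" {sentence.lower()} "


def select_examples(sentence: str, corpus: List[Pair],
                    max_n: int = 3, k: int = 4) -> Tuple[List[Pair], float]:
    """Hypothetical retrieval step: try longer fragments first and pick
    corpus pairs whose source side contains a not-yet-covered fragment.
    Returns the chosen pairs and the fraction of input words covered."""
    words = sentence.lower().split()
    covered = [False] * len(words)
    chosen: List[Pair] = []
    for n in range(min(max_n, len(words)), 0, -1):
        for i in range(len(words) - n + 1):
            if len(chosen) >= k or all(covered[i:i + n]):
                continue
            fragment = " ".join(words[i:i + n])
            match = next((p for p in corpus if contains(fragment, p[0])), None)
            if match is not None:
                if match not in chosen:
                    chosen.append(match)
                covered[i:i + n] = [True] * n
    coverage = sum(covered) / len(words) if words else 0.0
    return chosen, coverage


def build_prompt(sentence: str, examples: List[Pair],
                 src_lang: str, tgt_lang: str) -> str:
    """Assemble a few-shot prompt from the retrieved pairs (assumed layout)."""
    lines = [f"Translate from {src_lang} to {tgt_lang}."]
    for src, tgt in examples:
        lines.append(f"{src_lang}: {src}\n{tgt_lang}: {tgt}")
    lines.append(f"{src_lang}: {sentence}\n{tgt_lang}:")
    return "\n\n".join(lines)


# Toy corpus; the target sides are placeholders, not real Ladin text.
corpus = [("il gatto dorme", "<target 1>"), ("la casa è grande", "<target 2>")]
examples, coverage = select_examples("il gatto è grande", corpus)
prompt = build_prompt("il gatto è grande", examples, "Italian", "Ladin")
```

In this sketch, the coverage value is only a rough proxy for how much of the input is supported by retrieved examples; the paper's own definition of syntactic coverage may differ.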
Findings
Fragment-Shot Prompting improves translation quality for low-resource languages.
Stronger reasoning models better utilize retrieved knowledge for translation.
Prompt engineering has limited impact when translating from low-resource to high-resource languages.
Abstract
Large Language Models (LLMs) have demonstrated strong capabilities in multilingual machine translation, sometimes even outperforming traditional neural systems. However, previous research has highlighted the challenges of using LLMs, particularly with prompt engineering, for low-resource languages. In this work, we introduce Fragment-Shot Prompting, a novel in-context learning method that segments input and retrieves translation examples based on syntactic coverage, along with Pivoted Fragment-Shot, an extension that enables translation without direct parallel data. We evaluate these methods using GPT-3.5, GPT-4o, o1-mini, LLaMA-3.3, and DeepSeek-R1 for translation between Italian and two Ladin variants, revealing three key findings: (1) Fragment-Shot Prompting is effective for translating into and between the studied low-resource languages, with syntactic coverage positively…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Artificial Intelligence in Healthcare and Education
