The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities
David Stap, Eva Hasler, Bill Byrne, Christof Monz, Ke Tran

TL;DR
This paper investigates how fine-tuning large language models for translation improves quality but can degrade other valuable abilities, and proposes strategies to preserve these skills while enhancing translation performance.
Contribution
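The paper provides an extensive evaluation, on LLaMA and Falcon models from 7 billion to 65 billion parameters, of how fine-tuning affects translation quality and other LLM behaviors, and shows that the composition of the fine-tuning data matters for retaining those capabilities.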
Findings
Fine-tuning improves general translation quality but degrades formality steering, few-shot technical translation, and document-level translation.
Including monolingual data in fine-tuning preserves LLM abilities while improving translation.
Fine-tuning results in less literal translations, indicating a shift in translation style.
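The data-mixing idea in the findings above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the prompt template, the `mono_fraction` parameter, and the function names `format_parallel` and `build_mixture` are all hypothetical, and the summary does not state the mixing ratio the paper used.

```python
import random


def format_parallel(src, tgt, src_lang, tgt_lang):
    """Render one parallel sentence pair as a translation prompt (hypothetical template)."""
    return f"Translate from {src_lang} to {tgt_lang}.\n{src_lang}: {src}\n{tgt_lang}: {tgt}"


def build_mixture(parallel, monolingual, mono_fraction=0.5, seed=0):
    """Combine parallel and monolingual text into one shuffled fine-tuning set.

    parallel:      list of (src, tgt, src_lang, tgt_lang) tuples
    monolingual:   list of plain-text strings
    mono_fraction: target share of monolingual examples in the result
                   (an assumed knob; the value is illustrative only)
    """
    if not 0.0 <= mono_fraction < 1.0:
        raise ValueError("mono_fraction must be in [0, 1)")
    rng = random.Random(seed)
    examples = [format_parallel(*pair) for pair in parallel]
    # Number of monolingual examples needed to reach the target share,
    # capped by how many monolingual texts are available.
    n_mono = round(len(examples) * mono_fraction / (1.0 - mono_fraction))
    n_mono = min(n_mono, len(monolingual))
    examples.extend(rng.sample(monolingual, n_mono))
    rng.shuffle(examples)
    return examples
```

Calling `build_mixture(pairs, texts, mono_fraction=0.5)` would yield a set in which about half the examples are plain monolingual text and half are translation prompts, which could then be passed to any standard causal language model fine-tuning loop.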
Abstract
Fine-tuning large language models (LLMs) for machine translation has shown improvements in overall translation quality. However, it is unclear how fine-tuning affects desirable LLM behaviors that are not present in neural machine translation models, such as steerability, inherent document-level translation abilities, and the ability to produce less literal translations. We perform an extensive translation evaluation on the LLaMA and Falcon families of models, with model sizes ranging from 7 billion to 65 billion parameters. Our results show that while fine-tuning improves the general translation quality of LLMs, several abilities degrade. In particular, we observe a decline in the ability to perform formality steering, to produce technical translations through few-shot examples, and to perform document-level translation. On the other hand, we observe that the model produces…
Taxonomy
Topics: Natural Language Processing Techniques · Translation Studies and Practices
Methods: LLaMA
