Improving LLMs for Machine Translation Using Synthetic Preference Data
Dario Vajda, Domen Vreš, Marko Robnik-Šikonja

TL;DR
This paper demonstrates that a general instruction-tuned large language model can be effectively improved for machine translation by using synthetic preference data and Direct Preference Optimization, leading to better translation quality and fewer errors.
Contribution
The study introduces an approach for improving an instruction-tuned LLM for machine translation by training it with DPO on synthetic preference data, using Slovene translation as the use case.
Findings
The fine-tuned model outperforms both baseline models (GaMS-9B-Instruct and EuroLLM-9B-Instruct) in COMET scores.
The approach reduces language and formatting errors.
Achieves translation quality improvements with relatively little data.
Abstract
Large language models have emerged as effective machine translation systems. In this paper, we explore how a general instruction-tuned large language model can be improved for machine translation using relatively few easily produced data resources. Using Slovene as a use case, we improve the GaMS-9B-Instruct model using Direct Preference Optimization (DPO) training on a programmatically curated and enhanced subset of a public dataset. As DPO requires pairs of quality-ranked instances, we generated its training dataset by translating English Wikipedia articles using two LLMs, GaMS-9B-Instruct and EuroLLM-9B-Instruct. We ranked the resulting translations based on heuristics coupled with automatic evaluation metrics such as COMET. The evaluation shows that our fine-tuned model outperforms both models involved in the dataset generation. In comparison to the baseline models, the fine-tuned…
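The pair-construction step described in the abstract can be illustrated with a minimal sketch. It assumes each source text already has two candidate translations with a precomputed quality score (for example from a metric such as COMET). The `Candidate` class, the length-ratio check in `passes_heuristics`, the `min_margin` threshold, and the example sentences and scores are all hypothetical; the heuristics actually used in the paper are not given in this summary. The `prompt`/`chosen`/`rejected` field names follow a common convention for DPO training data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    model: str    # system that produced the translation
    text: str     # the translation itself
    score: float  # precomputed quality score (assumed, e.g. from COMET)


def passes_heuristics(source: str, cand: Candidate) -> bool:
    """Hypothetical sanity checks, not the paper's actual heuristics."""
    if not cand.text.strip():
        return False
    # Reject outputs whose length differs wildly from the source.
    ratio = len(cand.text) / max(len(source), 1)
    return 0.5 <= ratio <= 2.0


def make_pair(source: str, a: Candidate, b: Candidate,
              min_margin: float = 0.05) -> Optional[dict]:
    """Return one preference pair, or None if no clear preference exists."""
    ok_a = passes_heuristics(source, a)
    ok_b = passes_heuristics(source, b)
    if ok_a and not ok_b:
        chosen, rejected = a, b   # a heuristic failure decides the ranking
    elif ok_b and not ok_a:
        chosen, rejected = b, a
    elif ok_a and ok_b:
        # Both look sane: rank by score, but skip near-ties.
        if abs(a.score - b.score) < min_margin:
            return None
        chosen, rejected = (a, b) if a.score > b.score else (b, a)
    else:
        return None               # both failed, nothing worth preferring
    return {"prompt": source, "chosen": chosen.text, "rejected": rejected.text}


source = "Ljubljana is the capital of Slovenia."
gams = Candidate("GaMS-9B-Instruct", "Ljubljana je glavno mesto Slovenije.", 0.90)
euro = Candidate("EuroLLM-9B-Instruct", "Ljubljana je glavno mesto.", 0.80)
pair = make_pair(source, gams, euro)
```

Skipping near-ties is one possible design choice: a pair whose two sides are almost equally good carries little preference signal, so dropping it trades dataset size for cleaner rankings.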
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
