PMMT: Preference Alignment in Multilingual Machine Translation via LLM Distillation
Shuqiao Sun, Yutong Yao, Peiwen Wu, Feijun Jiang, Kaifu Zhang

TL;DR
This paper introduces PMMT, a novel approach using Large Language Models to generate multilingual corpora aligned with human preferences and distill these preferences into smaller MT models, improving translation quality and efficiency.
Contribution
It presents a new method for generating preference-aligned multilingual data and distilling human preferences into compact MT models, advancing personalized translation.
Findings
Outperforms existing methods on preference-aligned translation tasks
Achieves competitive results on WMT and Flores benchmarks
Efficiently supports large-scale online translation services
Abstract
Translation is important for cross-language communication, and many efforts have been made to improve its accuracy. However, less effort has gone into aligning translations with human preferences, such as translation tone or style. In this paper, a new method is proposed to effectively generate large-scale multilingual parallel corpora with specific translation preferences using Large Language Models (LLMs). Meanwhile, an automatic pipeline is designed to distill human preferences into smaller Machine Translation (MT) models that can efficiently and economically support large-scale calls in online services. Experiments indicate that the proposed method leads by a large margin in translation tasks with aligned human preferences. Meanwhile, on popular public benchmarks like WMT and Flores, on which our models were not trained, the proposed method also shows a competitive…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
