PMMT: Preference Alignment in Multilingual Machine Translation via LLM Distillation

Shuqiao Sun; Yutong Yao; Peiwen Wu; Feijun Jiang; Kaifu Zhang

arXiv:2410.11410 · cs.CL · October 16, 2024

TL;DR

This paper introduces PMMT, a novel approach using Large Language Models to generate multilingual corpora aligned with human preferences and distill these preferences into smaller MT models, improving translation quality and efficiency.

Contribution

It presents a new method for generating preference-aligned multilingual parallel corpora with LLMs, together with an automatic pipeline for distilling those preferences into compact MT models.

Findings

01 Outperforms existing methods on preference-aligned translation tasks

02 Achieves competitive results on WMT and Flores benchmarks

03 Efficiently supports large-scale online translation services

Abstract

Translation is important for cross-language communication, and many efforts have been made to improve its accuracy. However, less effort has gone into aligning translations with human preferences, such as translation tone or style. In this paper, a new method is proposed to effectively generate large-scale multilingual parallel corpora with specific translation preferences using Large Language Models (LLMs). In addition, an automatic pipeline is designed to distill human preferences into smaller Machine Translation (MT) models that can efficiently and economically support large-scale calls in online services. Experiments indicate that the proposed method leads by a large margin on translation tasks with aligned human preferences. On popular public benchmarks such as WMT and Flores, on which our models were not trained, the proposed method also shows a competitive…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling