Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models

Pengzhi Gao; Zhongjun He; Hua Wu; Haifeng Wang

arXiv:2401.05861 · cs.CL · February 8, 2024 · 2 cites

TL;DR

This paper enhances many-to-many multilingual machine translation in large language models by emphasizing prompt strategies and introducing a regularization technique, XConST, to improve zero-shot translation across multiple languages.

Contribution

The paper adapts the CrossConST regularization (Gao et al., 2023a) to translation instruction finetuning of LLMs, naming the adapted version XConST, and reports improved zero-shot multilingual translation performance.

Findings

01. XConST improves zero-shot translation accuracy.

02. Prompt strategies are crucial for multilingual LLM translation.

03. Method shows consistent gains on multiple benchmarks.

Abstract

The training paradigm for machine translation has gradually shifted, from learning neural machine translation (NMT) models with extensive parallel corpora to instruction finetuning on multilingual large language models (LLMs) with high-quality translation pairs. In this paper, we focus on boosting many-to-many multilingual translation of LLMs with an emphasis on zero-shot translation directions. We demonstrate that prompt strategies adopted during finetuning are crucial to zero-shot translation and introduce a cross-lingual consistency regularization, XConST, to bridge the representation gap among different languages and improve zero-shot translation performance. XConST is not a new method, but a version of CrossConST (Gao et al., 2023a) adapted for translation instruction finetuning with LLMs. Experimental results on ALMA (Xu et al., 2023), Tower (Team, 2024), and LLaMA-2 (Touvron et…
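The abstract describes XConST only as a cross-lingual consistency regularization added during finetuning, so the following is a minimal sketch of what such an objective could look like, not the authors' implementation. It assumes (this is not stated in the text above) that the regularizer is a KL divergence between two next-token distributions from the same model: one conditioned on the source sentence and one conditioned on the target sentence in its place. The function names, the weight `alpha`, the direction of the KL term, and the single-token setting are all hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the reference token."""
    return -math.log(probs[target_index])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def consistency_regularized_loss(logits_src, logits_tgt, target_index, alpha=1.0):
    """Translation loss plus an assumed consistency penalty.

    logits_src: hypothetical model scores when prompted with the source sentence.
    logits_tgt: hypothetical model scores when prompted with the target sentence
                in place of the source (an assumption of this sketch).
    alpha:      hypothetical weight on the consistency term.
    """
    p_src = softmax(logits_src)
    p_tgt = softmax(logits_tgt)
    return cross_entropy(p_src, target_index) + alpha * kl_divergence(p_src, p_tgt)


if __name__ == "__main__":
    # Toy scores over a three-token vocabulary; the reference token is index 0.
    src_scores = [2.0, 0.5, -1.0]
    tgt_scores = [1.5, 0.7, -0.5]
    print(consistency_regularized_loss(src_scores, tgt_scores, 0, alpha=0.5))
```

Under these assumptions the penalty vanishes when both prompts yield the same distribution and grows as they diverge, which is one way to read "bridging the representation gap among different languages". The official repository listed on this page is the place to check the actual objective.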

Code & Models

Repositories

gpengzhi/crossconst-llm (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Focus