Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation

Zhanglin Wu; Daimeng Wei; Xiaoyu Chen; Hengchao Shang; Jiaxin Guo; Zongyao Li; Yuanchang Luo; Jinlong Yang; Zhiqiang Rao; Hao Yang

arXiv:2505.13554·cs.CL·May 21, 2025

TL;DR

This paper proposes a hybrid translation method combining NMT and LLMs, using a novel scheduling policy to optimize translation quality while minimizing costly LLM usage, demonstrated through extensive multilingual experiments.

Contribution

It introduces a new decider leveraging source features to effectively combine NMT and LLMs, reducing LLM reliance without sacrificing translation quality.

Findings

01. Achieves high translation quality with minimal LLM usage.

02. The proposed decider outperforms existing scheduling policies.

03. Extensive multilingual tests validate the approach's effectiveness.

Abstract

Large language models (LLMs) show promising performance on a variety of downstream tasks, including machine translation (MT). However, using LLMs for translation incurs high computational cost and significant latency. Based on our evaluation, in most cases, translations produced by LLMs are comparable to those generated by neural machine translation (NMT) systems. Only in particular scenarios do LLM and NMT models show their respective advantages. As a result, integrating NMT and LLMs for translation and using the LLM only when necessary seems to be a sound solution. This requires a scheduling policy that optimizes translation quality while keeping translation fast and LLM usage as low as possible. We compare several scheduling policies and propose a novel and straightforward decider that leverages source sentence features. We conduct extensive experiments on multilingual test sets and the result…
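To make the idea of a source-feature decider concrete, here is a minimal sketch of how such routing might look. It is not the authors' method: the abstract does not say which source features or decision rule the paper uses, so the features (token count, punctuation ratio), the thresholds, and the names `extract_features`, `should_use_llm`, and `hybrid_translate` are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class SourceFeatures:
    """Hypothetical source-side features; the paper's real feature set is not given here."""
    num_tokens: int
    punct_ratio: float


def extract_features(source: str) -> SourceFeatures:
    """Compute cheap features from the source sentence only (no translation needed)."""
    tokens = source.split()
    chars = [c for c in source if not c.isspace()]
    n_chars = max(len(chars), 1)  # avoid division by zero on empty input
    puncts = sum(1 for c in chars if not c.isalnum())
    return SourceFeatures(num_tokens=len(tokens), punct_ratio=puncts / n_chars)


def should_use_llm(
    features: SourceFeatures,
    max_tokens: int = 40,          # assumed threshold, not from the paper
    max_punct_ratio: float = 0.2,  # assumed threshold, not from the paper
) -> bool:
    """Toy rule: send long or punctuation-heavy sentences to the LLM."""
    return features.num_tokens > max_tokens or features.punct_ratio > max_punct_ratio


def hybrid_translate(
    source: str,
    nmt: Callable[[str], str],
    llm: Callable[[str], str],
    decider: Callable[[SourceFeatures], bool] = should_use_llm,
) -> Tuple[str, str]:
    """Route one sentence: NMT by default, LLM only when the decider asks for it."""
    use_llm = decider(extract_features(source))
    system = "llm" if use_llm else "nmt"
    translate = llm if use_llm else nmt
    return system, translate(source)
```

Because the decision depends only on the source sentence, it can be made before either system runs, so the LLM is called only for the sentences routed to it. In practice the hand-written rule in `should_use_llm` would likely be replaced by a decider tuned or trained on held-out data.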

Taxonomy

Topics: Natural Language Processing Techniques