Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation
Zhanglin Wu, Daimeng Wei, Xiaoyu Chen, Hengchao Shang, Jiaxin Guo, Zongyao Li, Yuanchang Luo, Jinlong Yang, Zhiqiang Rao, Hao Yang

TL;DR
This paper proposes a hybrid translation method combining NMT and LLMs, using a novel scheduling policy to optimize translation quality while minimizing costly LLM usage, demonstrated through extensive multilingual experiments.
Contribution
It introduces a new decider that uses source sentence features to choose between NMT and an LLM for each input, reducing reliance on the LLM without sacrificing translation quality.
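To make the idea of a source-feature decider concrete, here is a minimal sketch in Python. It is not the authors' implementation: the two features (sentence length and the share of out-of-vocabulary tokens), the thresholds, and every name (`SourceFeatures`, `extract_features`, `should_use_llm`, `translate`) are hypothetical choices made for illustration, and the `nmt` and `llm` arguments stand in for whatever translation backends are available.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple


@dataclass
class SourceFeatures:
    """Hypothetical features computed from the source sentence only."""
    num_tokens: int
    rare_token_ratio: float


def extract_features(source: str, common_vocab: Set[str]) -> SourceFeatures:
    # Whitespace tokenization is an assumption; a real system would use
    # the tokenizer of its NMT model.
    tokens = source.lower().split()
    if not tokens:
        return SourceFeatures(num_tokens=0, rare_token_ratio=0.0)
    rare = sum(1 for t in tokens if t not in common_vocab)
    return SourceFeatures(len(tokens), rare / len(tokens))


def should_use_llm(features: SourceFeatures,
                   max_tokens: int = 40,
                   max_rare_ratio: float = 0.3) -> bool:
    # Illustrative rule: route long or rare-word-heavy inputs to the LLM.
    # The thresholds are placeholders, not values from the paper.
    return (features.num_tokens > max_tokens
            or features.rare_token_ratio > max_rare_ratio)


def translate(source: str,
              nmt: Callable[[str], str],
              llm: Callable[[str], str],
              common_vocab: Set[str]) -> Tuple[str, str]:
    """Return (system_used, translation) for one source sentence."""
    features = extract_features(source, common_vocab)
    if should_use_llm(features):
        return "llm", llm(source)
    return "nmt", nmt(source)
```

Because the decision depends only on the source sentence, it can be made before either system runs, so the LLM is called only for the inputs routed to it. A learned classifier over the same kind of features could replace the threshold rule without changing the `translate` interface.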
Findings
Achieves high translation quality with minimal LLM usage.
The proposed decider outperforms existing scheduling policies.
Extensive multilingual tests validate the approach's effectiveness.
Abstract
Large language models (LLMs) show promising performance in a variety of downstream tasks, such as machine translation (MT). However, using LLMs for translation incurs high computational costs and significant latency. Based on our evaluation, in most cases translations produced by LLMs are comparable to those generated by neural machine translation (NMT) systems. Only in particular scenarios do LLM and NMT models show their respective advantages. As a result, integrating NMT and LLMs for translation and using the LLM only when necessary seems to be a sound solution. A scheduling policy that optimizes translation results while ensuring fast speed and as little LLM usage as possible is therefore required. We compare several scheduling policies and propose a novel and straightforward decider that leverages source sentence features. We conduct extensive experiments on multilingual test sets and the result…
Taxonomy
Topics: Natural Language Processing Techniques
