SiLLM: Large Language Models for Simultaneous Machine Translation
Shoutao Guo, Shaolei Zhang, Zhengrui Ma, Min Zhang, Yang Feng

TL;DR
This paper introduces SiLLM, a novel approach for simultaneous machine translation that separates policy decision and translation tasks into two specialized agents, leveraging large language models for improved performance.
Contribution
The paper decouples SiMT into two agents: an LLM handles translation, while a conventional SiMT model handles policy decisions. The authors report state-of-the-art results with this setup.
Findings
SiLLM achieves state-of-the-art performance on two datasets.
Decoupling tasks improves translation quality and policy accuracy.
A small amount of fine-tuning data suffices to adapt the LLM.
Abstract
Simultaneous Machine Translation (SiMT) generates translations while reading the source sentence, necessitating a policy to determine the optimal timing for reading and generating words. Despite the remarkable performance achieved by Large Language Models (LLMs) across various NLP tasks, existing SiMT methods predominantly focus on conventional transformers, employing a single model to concurrently determine the policy and generate the translations. However, given the complexity of SiMT, it is challenging to effectively address both tasks with a single model. Therefore, there is a need to decouple the SiMT task into policy-decision and translation sub-tasks. We propose SiLLM, which delegates the two sub-tasks to separate agents, thereby incorporating LLMs into SiMT. The policy-decision agent is managed by a conventional SiMT model, responsible for determining the translation policy. The…
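The two-agent split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a wait-k rule stands in for the conventional SiMT policy model, and a toy word-by-word lexicon stands in for the LLM. All names here (`wait_k_policy`, `toy_translator`, `simultaneous_translate`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

READ, WRITE = "READ", "WRITE"
EOS = "<eos>"

Policy = Callable[[int, int, bool], str]
Translator = Callable[[List[str], List[str]], str]


def wait_k_policy(k: int) -> Policy:
    """Stand-in policy agent (assumption): stay k source words ahead of the target."""
    if k < 1:
        raise ValueError("k must be at least 1")

    def decide(num_read: int, num_written: int, source_done: bool) -> str:
        # Write once the whole source is read or the lag reaches k; otherwise read.
        if source_done or num_read - num_written >= k:
            return WRITE
        return READ

    return decide


def toy_translator(source_prefix: List[str], target_prefix: List[str]) -> str:
    """Stand-in translation agent (assumption): monotone word-by-word lookup."""
    lexicon = {"ich": "i", "sehe": "see", "dich": "you"}
    i = len(target_prefix)
    if i >= len(source_prefix):
        return EOS
    return lexicon.get(source_prefix[i], source_prefix[i])


def simultaneous_translate(
    source: List[str], policy: Policy, translator: Translator
) -> Tuple[List[str], List[str]]:
    """Alternate between the two agents: the policy picks READ or WRITE,
    and the translator only ever sees the source words read so far."""
    num_read = 0
    target: List[str] = []
    actions: List[str] = []
    while True:
        action = policy(num_read, len(target), num_read == len(source))
        if action == READ:
            num_read += 1
            actions.append(READ)
            continue
        word = translator(source[:num_read], target)
        if word == EOS:
            break
        target.append(word)
        actions.append(WRITE)
    return target, actions


if __name__ == "__main__":
    target, actions = simultaneous_translate(
        ["ich", "sehe", "dich"], wait_k_policy(2), toy_translator
    )
    print(target)
    print(actions)
```

Because the loop depends only on the two callables, either agent can be swapped without touching the other, which is the point of the decoupling the abstract argues for.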
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Focus
