SiLLM: Large Language Models for Simultaneous Machine Translation

Shoutao Guo; Shaolei Zhang; Zhengrui Ma; Min Zhang; Yang Feng

arXiv:2402.13036 · cs.CL · February 21, 2024 · 2 cites

TL;DR

This paper introduces SiLLM, an approach to simultaneous machine translation (SiMT) that assigns policy decision and translation to two separate agents: a conventional SiMT model determines the policy, and a large language model generates the translation.

Contribution

The paper proposes decoupling SiMT into policy-decision and translation sub-tasks handled by two agents, with an LLM used for translation and a conventional SiMT model used for policy decisions. The authors report state-of-the-art results with this setup.

Findings

01. SiLLM achieves state-of-the-art performance on two datasets.

02. Decoupling the two sub-tasks improves translation quality and policy accuracy.

03. A small amount of fine-tuning data suffices to adapt the LLM.

Abstract

Simultaneous Machine Translation (SiMT) generates translations while reading the source sentence, necessitating a policy to determine the optimal timing for reading and generating words. Despite the remarkable performance achieved by Large Language Models (LLM) across various NLP tasks, existing SiMT methods predominantly focus on conventional transformers, employing a single model to concurrently determine the policy and generate the translations. However, given the complexity of SiMT, it is challenging to effectively address both tasks with a single model. Therefore, there is a need to decouple the SiMT task into policy-decision and translation sub-tasks. We propose SiLLM, which delegates the two sub-tasks to separate agents, thereby incorporating LLM into SiMT. The policy-decision agent is managed by a conventional SiMT model, responsible for determining the translation policy. The…
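The division of labor described in the abstract can be illustrated with a small read/write loop. This is only a sketch under stated assumptions, not the authors' implementation: the policy agent here is a fixed wait-k rule standing in for a trained conventional SiMT model, the translation agent is a toy word lookup standing in for an LLM, and every name in the block (`wait_k_policy`, `toy_translator`, `simultaneous_translate`) is hypothetical.

```python
from typing import Callable, List, Tuple

READ, WRITE = "READ", "WRITE"


def wait_k_policy(k: int) -> Callable[[int, int, bool], str]:
    """Hypothetical policy agent: a wait-k rule, NOT the paper's trained SiMT model."""
    def decide(num_read: int, num_written: int, source_done: bool) -> str:
        # Write once the source is exhausted or we are k words ahead of the output.
        if source_done or num_read - num_written >= k:
            return WRITE
        return READ
    return decide


def toy_translator(source_prefix: List[str], target_prefix: List[str]) -> str:
    """Hypothetical translation agent: a word-by-word lookup standing in for an LLM."""
    lexicon = {"ich": "I", "sehe": "see", "den": "the", "hund": "dog"}
    i = len(target_prefix)
    if i >= len(source_prefix):
        return "<eos>"
    word = source_prefix[i]
    return lexicon.get(word.lower(), word)


def simultaneous_translate(
    source: List[str],
    policy: Callable[[int, int, bool], str],
    translate: Callable[[List[str], List[str]], str],
) -> Tuple[List[str], List[str]]:
    """Alternate between the two agents: the policy picks READ or WRITE,
    and the translator produces one target word per WRITE."""
    read, target, actions = 0, [], []
    while True:
        done = read == len(source)
        action = policy(read, len(target), done)
        actions.append(action)
        if action == READ and not done:
            read += 1  # reveal one more source word
        else:
            word = translate(source[:read], target)
            if word == "<eos>":
                break
            target.append(word)
    return target, actions


if __name__ == "__main__":
    out, acts = simultaneous_translate(
        ["Ich", "sehe", "den", "Hund"], wait_k_policy(2), toy_translator
    )
    print(" ".join(out))
    print(acts)
```

The point of the sketch is the interface: the translator only ever sees the source prefix that the policy has chosen to reveal, so either agent can be replaced independently of the other.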

Code & Models

Repositories

ictnlp/sillm · PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Focus