Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation

Shoutao Guo; Shaolei Zhang; Zhengrui Ma; Yang Feng

arXiv:2501.00868 · cs.CL · January 3, 2025

TL;DR

This paper introduces a novel framework where large language models act as policy-makers in simultaneous generation tasks, optimizing output timing to balance latency and quality, and achieving state-of-the-art results in translation and speech recognition.

Contribution

The paper proposes the LSG framework, which enables off-the-shelf LLMs to decide generation timing, a capability that prior methods had not explored effectively.

Findings

01 Achieves state-of-the-art performance in simultaneous translation

02 Demonstrates practicality in streaming automatic speech recognition

03 Utilizes open-source LLMs effectively in real-world scenarios

Abstract

Simultaneous generation models write generation results while reading streaming inputs, necessitating a policy-maker to determine the appropriate output timing. Existing simultaneous generation methods generally adopt the traditional encoder-decoder architecture and learn the generation and policy-making capabilities through complex dynamic programming techniques. Although LLMs excel at text generation, they face challenges in taking on the role of policy-makers through traditional training methods, limiting their exploration in simultaneous generation. To overcome these limitations, we propose a novel LLM-driven Simultaneous Generation (LSG) framework, which allows the off-the-shelf LLM to decide the generation timing and produce output concurrently. Specifically, LSG selects the generation policy that minimizes latency as the baseline policy. Referring to the baseline policy, LSG…
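
To make the read/write setting described in the abstract concrete, the following is a minimal sketch of a generic simultaneous generation loop, in which a policy decides at each step whether to read another source token or write a target token. It is not the LSG policy, whose details are cut off in the abstract above. The policy shown is a simple wait-k rule used purely as a stand-in, and every name here (`wait_k_policy`, `toy_generate`, `simultaneous_generate`) is hypothetical, as is the toy "model" that simply uppercases source tokens.

```python
from typing import Callable, List, Tuple

READ, WRITE = "READ", "WRITE"

# A policy maps (tokens read, tokens written, source finished?) to an action.
Policy = Callable[[int, int, bool], str]


def wait_k_policy(k: int) -> Policy:
    """Stand-in policy (an assumption, not LSG): stay k tokens behind the source."""
    if k < 1:
        raise ValueError("k must be at least 1")

    def decide(num_read: int, num_written: int, source_done: bool) -> str:
        # Write once the lag reaches k, or when there is nothing left to read.
        if source_done or num_read - num_written >= k:
            return WRITE
        return READ

    return decide


def toy_generate(seen: List[str], written: List[str]) -> str:
    """Toy stand-in for a generation model: uppercase the aligned source token."""
    return seen[len(written)].upper()


def simultaneous_generate(
    source: List[str],
    policy: Policy,
    generate: Callable[[List[str], List[str]], str],
) -> Tuple[List[str], List[str]]:
    """Interleave READ and WRITE actions over a streaming source.

    Toy assumption: the target has exactly as many tokens as the source.
    """
    seen: List[str] = []
    written: List[str] = []
    actions: List[str] = []
    while len(written) < len(source):
        source_done = len(seen) == len(source)
        action = policy(len(seen), len(written), source_done)
        if action == READ:
            seen.append(source[len(seen)])  # consume one more streamed token
        else:
            written.append(generate(seen, written))  # emit one target token
        actions.append(action)
    return written, actions


if __name__ == "__main__":
    output, trace = simultaneous_generate(
        ["a", "b", "c", "d"], wait_k_policy(2), toy_generate
    )
    print(output)
    print(trace)
```

A smaller k writes sooner (lower latency) with less source context available per written token, which is the latency/quality trade-off that any read/write policy has to manage.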

Code & Models

Repositories

ictnlp/LSG
PyTorch · Official

Videos

Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation · Underline

Taxonomy

Topics: Topic Modeling

Methods: ADaptive gradient method with the OPTimal convergence rate