Simulstream: Open-Source Toolkit for Evaluation and Demonstration of Streaming Speech-to-Text Translation Systems

Marco Gaido; Sara Papi; Mauro Cettolo; Matteo Negri; Luisa Bentivogli

arXiv:2512.17648 · cs.CL · December 22, 2025

Open Access

TL;DR

Simulstream is an open-source toolkit that enables comprehensive evaluation and live demonstration of streaming speech-to-text translation systems, supporting long-form audio, incremental decoding, re-translation, and interactive demos.

Contribution

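The paper introduces simulstream, which the authors present as the first open-source framework for unified evaluation and demonstration of streaming speech-to-text translation (StreamST) systems, with support for long-form audio, re-translation, and interactive web demos.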

Findings

01. Supports long-form speech processing and re-translation methods.

02. Enables comparison of quality and latency across different systems.

03. Provides an interactive web interface for system demonstration.
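
To make the distinction between incremental decoding and re-translation in the findings above concrete, here is a minimal Python sketch. It is not taken from simulstream and does not use its API; the function names (`common_prefix_len`, `erased_tokens`) and the toy hypotheses are assumptions made purely for illustration. The idea is that an incremental system only appends to what it has already shown, while a re-translation system may revise earlier output, so some displayed tokens get erased.

```python
from typing import List


def common_prefix_len(a: List[str], b: List[str]) -> int:
    """Return the number of leading tokens shared by two hypotheses."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def erased_tokens(updates: List[List[str]]) -> int:
    """Count tokens removed from the display across successive full hypotheses.

    Each element of `updates` is the complete hypothesis shown after one
    update. Tokens beyond the prefix shared with the next hypothesis are
    treated as erased (hypothetical measure, for illustration only).
    """
    erased = 0
    shown: List[str] = []
    for hyp in updates:
        keep = common_prefix_len(shown, hyp)
        erased += len(shown) - keep
        shown = hyp
    return erased


# Toy data (invented): an append-only system versus one that revises output.
incremental = [["the"], ["the", "cat"], ["the", "cat", "sleeps"]]
retranslation = [["a"], ["the", "cat"], ["the", "cat", "sleeps"]]

print(erased_tokens(incremental))
print(erased_tokens(retranslation))
```

In this toy example the append-only sequence never erases anything, while the revising sequence replaces its first token once. An evaluation setup that assumes append-only output cannot represent the second case, which is the gap the summary says the toolkit addresses.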

Abstract

Streaming Speech-to-Text Translation (StreamST) requires producing translations concurrently with incoming speech, imposing strict latency constraints and demanding models that balance partial-information decision-making with high translation quality. Research efforts on the topic have so far relied on the SimulEval repository, which is no longer maintained and does not support systems that revise their outputs. In addition, it has been designed for simulating the processing of short segments, rather than long-form audio streams, and it does not provide an easy method to showcase systems in a demo. As a solution, we introduce simulstream, the first open-source framework dedicated to unified evaluation and demonstration of StreamST systems. Designed for long-form speech processing, it supports not only incremental decoding approaches, but also re-translation methods, enabling for their…

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and dialogue systems