CUNI-KIT System for Simultaneous Speech Translation Task at IWSLT 2022

Peter Polák; Ngoc-Quan Pham; Tuan-Nam Nguyen; Danni Liu; Carlos Mullov; Jan Niehues; Ondřej Bojar; Alexander Waibel

arXiv:2204.06028 · cs.CL · May 12, 2022

Open Access

TL;DR

This paper presents a system for simultaneous speech translation that uses an unmodified offline model, achieving near-offline translation quality at significantly reduced latency and surpassing the best IWSLT 2021 simultaneous system in the medium and high latency regimes.

Contribution

We introduce an onlinization algorithm that enables offline models to operate effectively in real-time translation without altering the original model architecture.
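
As an illustration of what running an unmodified offline model in a simultaneous setting can look like, here is a minimal sketch. It assumes a prefix-agreement policy: the model is re-run on the growing audio prefix, and only tokens on which two consecutive hypotheses agree are emitted. This summary does not specify the policy, so that choice, along with the names `onlinize`, `common_prefix`, and `toy_translate`, is hypothetical and not taken from the authors' released system.

```python
from typing import Callable, List, Sequence

Tokens = List[str]


def common_prefix(a: Sequence[str], b: Sequence[str]) -> Tokens:
    """Return the longest shared prefix of two token sequences."""
    prefix: Tokens = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


def onlinize(chunks: Sequence[Sequence[float]],
             translate: Callable[[List[float]], Tokens]) -> List[Tokens]:
    """Run an offline `translate` function incrementally.

    Assumed policy (illustrative only): after each audio chunk, re-translate
    the whole audio prefix and emit the tokens shared by the current and the
    previous hypothesis that have not been emitted yet.
    """
    audio: List[float] = []       # all audio samples received so far
    committed: Tokens = []        # tokens already shown to the user
    previous: Tokens = []         # hypothesis from the previous step
    emitted: List[Tokens] = []    # new tokens emitted at each step

    for chunk in chunks:
        audio.extend(chunk)
        hypothesis = translate(audio)  # the offline model itself is unchanged
        stable = common_prefix(previous, hypothesis)
        # Only commit tokens that extend what was already emitted.
        if stable[:len(committed)] == committed:
            new = stable[len(committed):]
        else:
            new = []
        committed.extend(new)
        emitted.append(new)
        previous = hypothesis

    # End of input: emit whatever remains of the final hypothesis.
    emitted.append(previous[len(committed):])
    return emitted


def toy_translate(audio: List[float]) -> Tokens:
    """Stand-in for an offline model: the hypothesis depends on audio length."""
    hypotheses = {
        1: ["that"],
        2: ["this", "is"],
        3: ["this", "is", "a", "test"],
    }
    return hypotheses[len(audio)]


steps = onlinize([[0.0], [0.0], [0.0]], toy_translate)
print(steps)
```

In this toy run the unstable first guess ("that") is never emitted, because the next hypothesis disagrees with it. Waiting for agreement costs some latency but avoids showing text that would later be revised.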

Findings

01

The onlinized offline model is almost on par with the offline setting in translation quality.

02

On the test set, the onlinized model is 3 times faster than the offline setting in terms of latency.

03

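The onlinized model outperforms the best IWSLT 2021 simultaneous system in the medium and high latency regimes and is almost on par with it in the low latency regime.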

Abstract

In this paper, we describe our submission to the Simultaneous Speech Translation task at IWSLT 2022. We explore strategies to utilize an offline model in a simultaneous setting without the need to modify the original model. In our experiments, we show that our onlinization algorithm is almost on par with the offline setting while being $3 \times$ faster than offline in terms of latency on the test set. We also show that the onlinized offline model outperforms the best IWSLT 2021 simultaneous system in the medium and high latency regimes and is almost on par in the low latency regime. We make our system publicly available.

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Speech and dialogue systems