CUNI-KIT System for Simultaneous Speech Translation Task at IWSLT 2022
Peter Polák, Ngoc-Quan Pham, Tuan-Nam Nguyen, Danni Liu, Carlos Mullov, Jan Niehues, Ondřej Bojar, Alexander Waibel

TL;DR
This paper presents a simultaneous speech translation system that reuses an offline model without modifying it, achieving near-offline quality at substantially lower latency and surpassing the best IWSLT 2021 simultaneous system in the medium and high latency regimes.
Contribution
We introduce an onlinization algorithm that enables offline models to operate effectively in real-time translation without altering the original model architecture.
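As a rough illustration of how an unmodified offline model might be driven incrementally, the sketch below re-translates the growing audio prefix after every chunk and emits only the words on which two consecutive hypotheses agree. The commit rule (longest common prefix), the function names `onlinize` and `toy_translate`, and the toy model itself are assumptions made for this example; the summary above does not specify the authors' actual algorithm.

```python
from typing import Callable, Iterable, Iterator, List, Sequence


def common_prefix(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Return the longest shared prefix of two token sequences."""
    out: List[str] = []
    for x, y in zip(a, b):
        if x != y:
            break
        out.append(x)
    return out


def onlinize(
    chunks: Iterable[Sequence[float]],
    translate: Callable[[Sequence[float]], List[str]],
) -> Iterator[List[str]]:
    """Wrap an offline `translate` function for incremental use.

    Assumed policy (illustrative only): after each new chunk, translate all
    audio received so far and commit the tokens that the current and the
    previous hypothesis have in common. A real system would typically also
    force the decoder to continue from the already committed output.
    """
    audio: List[float] = []
    committed: List[str] = []
    previous: List[str] = []
    for chunk in chunks:
        audio.extend(chunk)
        hypothesis = translate(audio)
        stable = common_prefix(previous, hypothesis)
        if len(stable) > len(committed):
            new_tokens = stable[len(committed):]
            committed.extend(new_tokens)
            yield new_tokens
        previous = hypothesis
    # End of input: nothing more can change, so flush the rest.
    tail = previous[len(committed):]
    if tail:
        yield tail


# Hypothetical stand-in for an offline model: one word per audio frame,
# with the newest word unreliable until the full input has arrived.
TARGET = ["we", "translate", "speech", "now"]


def toy_translate(audio: Sequence[float]) -> List[str]:
    hyp = TARGET[: len(audio)]
    if len(audio) < len(TARGET):
        hyp = hyp[:-1] + ["<unsure>"]
    return hyp


emitted = list(onlinize([[0.0], [0.0], [0.0], [0.0]], toy_translate))
```

The offline model is called as a black box here, which is the property the contribution emphasizes; the trade-off between latency and quality would then be governed by the chunk size and by how conservative the commit rule is.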
Findings
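The onlinized offline model is almost on par with the offline setting in translation quality.
On the test set, it is 3 times faster than the offline setting in terms of latency.
It outperforms the best IWSLT 2021 simultaneous system in the medium and high latency regimes and is almost on par in the low latency regime.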
Abstract
In this paper, we describe our submission to the Simultaneous Speech Translation task at IWSLT 2022. We explore strategies to utilize an offline model in a simultaneous setting without the need to modify the original model. In our experiments, we show that our onlinization algorithm is almost on par with the offline setting while being 3× faster than offline in terms of latency on the test set. We also show that the onlinized offline model outperforms the best IWSLT 2021 simultaneous system in the medium and high latency regimes and is almost on par in the low latency regime. We make our system publicly available.
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Speech and dialogue systems
