Contrastive Feedback Mechanism for Simultaneous Speech Translation

Haotian Tan; Sakriani Sakti

arXiv:2407.20524·cs.CL·April 13, 2026

TL;DR

This paper introduces the contrastive feedback mechanism (CFM) for simultaneous speech translation, which uses unstable predictions as feedback to improve translation quality across 8 languages.

Contribution

CFM applies a contrastive objective to unstable predictions, using them as feedback rather than discarding them, and improves simultaneous speech translation (SST) quality when combined with existing decision policies.

Findings

1. CFM improves translation quality across 8 languages in MuST-C.

2. The gains are consistent across 3 state-of-the-art decision policies.

3. CFM mitigates the impact of unstable predictions on translation quality.

Abstract

Recent advances in simultaneous speech translation (SST) focus on the decision policies that enable the use of offline-trained ST models for simultaneous inference. These decision policies not only control the quality-latency trade-off in SST but also mitigate the impact of unstable predictions on translation quality by delaying translation for more context or discarding these predictions through stable hypothesis detection. However, these policies often overlook the potential benefits of utilizing unstable predictions. We introduce the contrastive feedback mechanism (CFM) for SST, a novel method that leverages these unstable predictions as feedback to improve translation quality. CFM guides the system to eliminate undesired model behaviors from these predictions through a contrastive objective. The experiments on 3 state-of-the-art decision policies across 8 languages in the MuST-C…
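To make the notion of "unstable predictions" concrete, the following is a minimal sketch of one common form of stable hypothesis detection: keeping only the longest common prefix of two consecutive hypotheses. It is not taken from the paper. The function name `split_stable_unstable` and the example tokens are hypothetical, and the paper's policies may detect stability differently. The discarded suffix is the kind of prediction that, per the abstract, CFM would use as feedback instead of throwing away; the contrastive objective itself is not sketched here.

```python
from typing import List, Sequence, Tuple


def split_stable_unstable(
    previous: Sequence[str], current: Sequence[str]
) -> Tuple[List[str], List[str]]:
    """Split `current` into (stable prefix, unstable suffix).

    Assumption: a token is "stable" if two consecutive hypotheses agree
    on it and on every token before it (longest common prefix).
    """
    agree = 0
    for prev_tok, cur_tok in zip(previous, current):
        if prev_tok != cur_tok:
            break
        agree += 1
    return list(current[:agree]), list(current[agree:])


# Hypotheses after two successive audio chunks (made-up tokens).
hyp_chunk_1 = ["the", "cat", "sat", "in"]
hyp_chunk_2 = ["the", "cat", "sat", "on", "the", "mat"]

stable, unstable = split_stable_unstable(hyp_chunk_1, hyp_chunk_2)
print(stable)    # ['the', 'cat', 'sat']
print(unstable)  # ['on', 'the', 'mat']
```

In a policy of this kind, only `stable` would be emitted to the user, while `unstable` is re-decoded once more audio arrives.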
