Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation

Sara Papi; Marco Gaido; Matteo Negri; Marco Turchi

arXiv:2206.05807 · cs.CL · October 19, 2023

TL;DR

This paper shows that Average Lagging (AL), the standard latency metric in simultaneous speech translation, underestimates latency for systems whose predictions are longer than the references, and proposes a modified metric, LAAL, that gives unbiased scores for both under- and over-generating systems.

Contribution

The paper introduces LAAL, a length-adaptive version of Average Lagging, addressing over-generation bias in simultaneous speech translation evaluation.

Findings

01 LAAL corrects the underestimation of lagging that AL produces for over-generating systems

02 Recent SimulST systems tend to over-generate, so the bias affects their AL scores in practice

03 LAAL allows unbiased latency evaluation of both under- and over-generating systems

Abstract

Simultaneous speech translation (SimulST) systems aim at generating their output with the lowest possible latency, which is normally computed in terms of Average Lagging (AL). In this paper we highlight that, despite its widespread adoption, AL provides underestimated scores for systems that generate longer predictions compared to the corresponding references. We also show that this problem has practical relevance, as recent SimulST systems have indeed a tendency to over-generate. As a solution, we propose LAAL (Length-Adaptive Average Lagging), a modified version of the metric that takes into account the over-generation phenomenon and allows for unbiased evaluation of both under-/over-generating systems.
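The difference between the two metrics can be illustrated with a minimal sketch. It is not taken from this page or from the authors' code. It assumes the commonly used speech formulation of AL, in which the delay of the i-th emitted word is compared with an ideal delay of (i-1) * source_length / target_length, and it assumes that LAAL differs only in replacing the reference length with the maximum of the reference length and the prediction length. The function names and the toy delays are hypothetical.

```python
from typing import Sequence


def _lagging(delays: Sequence[float], source_length: float, target_length: int) -> float:
    """Mean of (actual delay - ideal delay) over the emitted words.

    Assumption: the ideal delay of the i-th word (0-based) is
    i * source_length / target_length, and the sum stops at the first
    word emitted once the whole source has been consumed.
    """
    if not delays:
        raise ValueError("delays must not be empty")
    total = 0.0
    tau = 0
    for i, delay in enumerate(delays):
        total += delay - i * source_length / target_length
        tau = i + 1
        if delay >= source_length:
            break
    return total / tau


def average_lagging(delays: Sequence[float], source_length: float, reference_length: int) -> float:
    # AL normalizes by the reference length only.
    return _lagging(delays, source_length, reference_length)


def length_adaptive_average_lagging(
    delays: Sequence[float], source_length: float, reference_length: int
) -> float:
    # LAAL (as assumed here) normalizes by the longer of reference and prediction.
    return _lagging(delays, source_length, max(reference_length, len(delays)))


# Toy case: 1000 ms of source audio, a 2-word reference, a 4-word prediction.
delays_ms = [500.0, 600.0, 700.0, 800.0]
al = average_lagging(delays_ms, 1000.0, 2)
laal = length_adaptive_average_lagging(delays_ms, 1000.0, 2)
print(al, laal)  # -100.0 275.0
```

In this toy case the extra words push AL below zero, although every word was emitted at least 500 ms into the audio, while the length-adaptive variant stays positive. When the prediction is no longer than the reference, the two functions return the same value.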

Code & Models

Repositories

hlt-mt/fbk-fairseq
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Speech and dialogue systems