Efficient yet Competitive Speech Translation: FBK@IWSLT2022

Marco Gaido; Sara Papi; Dennis Fucci; Giuseppe Fiameni; Matteo Negri; Marco Turchi

arXiv:2205.02629 · cs.CL · October 19, 2023

TL;DR

This paper presents a cost-effective speech translation system that eliminates the need for ASR pre-training, employs simple data filtering, and addresses audio segmentation issues, achieving competitive results in offline and simultaneous tasks.

Contribution

It demonstrates that ASR pre-training is unnecessary for competitive speech translation, introduces a simple data filtering method, and compares strategies to handle segmentation mismatch, reducing training costs.

Findings

01

Achieved 26.7 BLEU on the MuST-C en-de corpus.

02

Improved BLEU on the IWSLT2020 test set by 1.6 over the previous best.

03

Validated the effectiveness of the lightweight training strategy.

Abstract

The primary goal of FBK's systems submission to the IWSLT 2022 offline and simultaneous speech translation tasks is to reduce model training costs without sacrificing translation quality. As such, we first question the need for ASR pre-training, showing that it is not essential to achieve competitive results. Second, we focus on data filtering, showing that a simple method that looks at the ratio between source and target characters yields a quality improvement of 1 BLEU. Third, we compare different methods to reduce the detrimental effect of the audio segmentation mismatch between training data, which is manually segmented at sentence level, and inference data, which is automatically segmented. Toward the same goal of training cost reduction, we participate in the simultaneous task with the same model trained for offline ST. The effectiveness of our lightweight training strategy is shown by…
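The character-ratio filter mentioned in the abstract can be illustrated with a minimal sketch. The abstract only says the method "looks at the ratio between source and target characters", so everything else here is an assumption for illustration: the function names `char_ratio` and `filter_by_char_ratio`, the direction of the ratio (target over source), and the thresholds `min_ratio` and `max_ratio` are hypothetical and are not taken from the paper or from the fbk-fairseq repository.

```python
from typing import Iterable, List, Tuple


def char_ratio(source: str, target: str) -> float:
    """Return target length / source length, counted in characters.

    An empty source yields infinity so that such pairs are always rejected.
    """
    src_len = len(source.strip())
    tgt_len = len(target.strip())
    if src_len == 0:
        return float("inf")
    return tgt_len / src_len


def filter_by_char_ratio(
    pairs: Iterable[Tuple[str, str]],
    min_ratio: float = 0.5,  # hypothetical lower bound, not from the paper
    max_ratio: float = 2.0,  # hypothetical upper bound, not from the paper
) -> List[Tuple[str, str]]:
    """Keep only (source, target) pairs whose character ratio is in range."""
    kept = []
    for source, target in pairs:
        if min_ratio <= char_ratio(source, target) <= max_ratio:
            kept.append((source, target))
    return kept


if __name__ == "__main__":
    samples = [
        ("Hello world.", "Hallo Welt."),                       # plausible pair
        ("Thank you.", "Danke. Danke. Danke. Danke. Danke."),  # target far too long
        ("", "Leer"),                                          # empty source
    ]
    for src, tgt in filter_by_char_ratio(samples):
        print(f"{src!r} -> {tgt!r}")
```

In practice the bounds would be tuned on the training data, for example by inspecting the distribution of ratios and cutting off its tails; the fixed values above only demonstrate the mechanism.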

Code & Models

Repositories

hlt-mt/fbk-fairseq
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and dialogue systems