GMU Systems for the IWSLT 2025 Low-Resource Speech Translation Shared Task

Chutong Meng; Antonios Anastasopoulos

arXiv:2505.21781·cs.CL·May 29, 2025

TL;DR

This paper presents GMU's systems for the IWSLT 2025 low-resource speech translation shared task. The systems fine-tune SeamlessM4T-v2 for ASR, MT, and end-to-end ST, and compare several E2E training paradigms across all language pairs except Levantine Arabic.

Contribution

We demonstrate the effectiveness of fine-tuning SeamlessM4T-v2 for speech translation and explore training strategies such as multi-task training and parameter initialization from fine-tuned ASR and/or MT components.

Findings

01 Direct E2E fine-tuning yields strong results.

02 Initializing with a fine-tuned ASR encoder improves ST performance on languages SeamlessM4T-v2 has not been trained on.

03 Multi-task training can be slightly helpful.

Abstract

This paper describes the GMU systems for the IWSLT 2025 low-resource speech translation shared task. We trained systems for all language pairs, except for Levantine Arabic. We fine-tuned SeamlessM4T-v2 for automatic speech recognition (ASR), machine translation (MT), and end-to-end speech translation (E2E ST). The ASR and MT models are also used to form cascaded ST systems. Additionally, we explored various training paradigms for E2E ST fine-tuning, including direct E2E fine-tuning, multi-task training, and parameter initialization using components from fine-tuned ASR and/or MT models. Our results show that (1) direct E2E fine-tuning yields strong results; (2) initializing with a fine-tuned ASR encoder improves ST performance on languages SeamlessM4T-v2 has not been trained on; (3) multi-task training can be slightly helpful.
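The parameter-initialization paradigm mentioned in the abstract can be illustrated with a minimal sketch. It assumes model weights are available as name-to-weights mappings (plain lists stand in for tensors here) and that encoder and decoder parameters are distinguishable by the hypothetical name prefixes `speech_encoder.` and `text_decoder.`. The helper `init_from_components` and the toy weights are illustrative only; the abstract does not specify how the authors implemented this step.

```python
from typing import Dict, Iterable, List

# Stand-in for a framework state dict: parameter name -> weights.
StateDict = Dict[str, List[float]]


def init_from_components(
    st_state: StateDict,
    donor_state: StateDict,
    prefixes: Iterable[str],
) -> StateDict:
    """Return a copy of st_state in which every parameter whose name starts
    with one of `prefixes` is replaced by the donor's value (hypothetical helper)."""
    prefix_tuple = tuple(prefixes)
    merged = dict(st_state)  # leave the input mapping untouched
    for name, weights in donor_state.items():
        if name.startswith(prefix_tuple) and name in merged:
            merged[name] = list(weights)
    return merged


# Toy weights; names and values are assumptions for illustration.
pretrained = {
    "speech_encoder.layer0.weight": [0.0, 0.0],
    "text_decoder.layer0.weight": [0.0, 0.0],
}
asr_finetuned = {
    "speech_encoder.layer0.weight": [1.0, 2.0],
    "text_decoder.layer0.weight": [9.0, 9.0],
}
mt_finetuned = {
    "text_decoder.layer0.weight": [3.0, 4.0],
}

# Take the speech encoder from the fine-tuned ASR model,
# then the text decoder from the fine-tuned MT model.
st_init = init_from_components(pretrained, asr_finetuned, ["speech_encoder."])
st_init = init_from_components(st_init, mt_finetuned, ["text_decoder."])
```

The resulting `st_init` would then be the starting point for E2E ST fine-tuning. Passing only the ASR donor corresponds to the "ASR encoder" initialization in finding (2), while adding the MT donor corresponds to the "ASR and/or MT" variant.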

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and dialogue systems