Enabling Zero-shot Multilingual Spoken Language Translation with Language-Specific Encoders and Decoders
Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa, Carlos Segura

TL;DR
This paper introduces a zero-shot multilingual spoken language translation (SLT) method built on language-specific encoders and decoders. It trains only on ASR and MultiNMT data, requires no multilingual SLT training data, and achieves translation quality comparable to a bilingual baseline.
Contribution
It extends a MultiNMT architecture based on language-specific encoders and decoders to multilingual SLT, which enables zero-shot translation, and it proposes an Adapter module for coupling the speech inputs that further improves performance.
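The routing idea behind language-specific encoders and decoders can be illustrated with a toy sketch. This is not the authors' implementation: the class `ToyMultiSLT`, the helper `make_decoder`, the language codes, and the stand-in encoder, adapter, and decoder functions are all hypothetical, and no training is modeled. The sketch only shows why a single speech encoder that emits states in the shared representation can be paired with every existing decoder, including pairs never seen together.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]
Encoder = Callable[[List[float]], Vector]
Adapter = Callable[[Vector], Vector]
Decoder = Callable[[Vector], str]


class ToyMultiSLT:
    """Hypothetical registry of per-language modules; not the authors' code."""

    def __init__(self) -> None:
        self.speech_encoders: Dict[str, Encoder] = {}
        self.adapters: Dict[str, Adapter] = {}
        self.decoders: Dict[str, Decoder] = {}

    def add_speech_encoder(self, lang: str, encoder: Encoder,
                           adapter: Optional[Adapter] = None) -> None:
        self.speech_encoders[lang] = encoder
        if adapter is not None:
            self.adapters[lang] = adapter

    def add_decoder(self, lang: str, decoder: Decoder) -> None:
        self.decoders[lang] = decoder

    def translate(self, audio: List[float], src: str, tgt: str) -> str:
        # Encode with the source-language speech encoder.
        hidden = self.speech_encoders[src](audio)
        # Optionally pass the states through the adapter for that encoder.
        adapter = self.adapters.get(src)
        if adapter is not None:
            hidden = adapter(hidden)
        # Any target decoder can consume the shared representation.
        return self.decoders[tgt](hidden)


def make_decoder(lang: str) -> Decoder:
    # Stand-in for a trained text decoder: it only reports what it received.
    return lambda hidden: f"[{lang}] decoded {len(hidden)} states"


system = ToyMultiSLT()
for lang in ("en", "de", "fr", "es"):  # placeholder language codes (assumption)
    system.add_decoder(lang, make_decoder(lang))

# Stand-in speech encoder: one "state" per pair of audio samples.
system.add_speech_encoder(
    "en",
    encoder=lambda audio: [sum(audio[i:i + 2]) for i in range(0, len(audio), 2)],
    adapter=lambda hidden: [0.5 * h for h in hidden],
)

# One speech encoder, every registered decoder: no pair-specific module needed.
outputs = {tgt: system.translate([0.1, 0.2, 0.3, 0.4], "en", tgt)
           for tgt in system.decoders}
```

In this toy setup, adding one speech encoder makes every registered target language reachable without a dedicated speech-to-target module, which mirrors the zero-shot property the summary describes.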
Findings
Achieves similar translation quality to bilingual baselines ($\pm 0.2$ BLEU)
Enables zero-shot MultiSLT without multilingual SLT training data
Adapter module improves BLEU scores by up to 6 points
Abstract
Current end-to-end approaches to Spoken Language Translation (SLT) rely on limited training resources, especially for multilingual settings. On the other hand, Multilingual Neural Machine Translation (MultiNMT) approaches rely on higher-quality and more massive data sets. Our proposed method extends a MultiNMT architecture based on language-specific encoders-decoders to the task of Multilingual SLT (MultiSLT). Our method entirely eliminates the dependency from MultiSLT data and it is able to translate while training only on ASR and MultiNMT data. Our experiments on four different languages show that coupling the speech encoder to the MultiNMT architecture produces similar quality translations compared to a bilingual baseline ($\pm 0.2$ BLEU) while effectively allowing for zero-shot MultiSLT. Additionally, we propose using an Adapter module for coupling the speech inputs. This Adapter…
