Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond
Beomseok Lee, Ioan Calapodescu, Marco Gaido, Matteo Negri, Laurent Besacier

TL;DR
Speech-MASSIVE is a multilingual speech dataset for spoken language understanding (SLU) and other speech tasks. It covers 12 languages and supports model evaluation in zero-shot, few-shot, and full fine-tuning scenarios.
Contribution
It introduces a large, multilingual speech dataset with annotations for SLU tasks, filling a gap in resources for diverse language and task evaluation.
Findings
Baseline SLU results using cascaded and end-to-end models
Effective zero-shot and few-shot learning scenarios demonstrated
Dataset supports benchmarking speech transcription, language ID, and translation
Abstract
We present Speech-MASSIVE, a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSIVE textual corpus. Speech-MASSIVE covers 12 languages from different families and inherits from MASSIVE the annotations for the intent prediction and slot-filling tasks. Our extension is prompted by the scarcity of massively multilingual SLU datasets and the growing need for versatile speech datasets to assess foundation models (LLMs, speech encoders) across languages and tasks. We provide a multimodal, multitask, multilingual dataset and report SLU baselines using both cascaded and end-to-end architectures in various training scenarios (zero-shot, few-shot, and full fine-tune). Furthermore, we demonstrate the suitability of Speech-MASSIVE for benchmarking other tasks such as speech transcription, language identification, and speech…
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques
