Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond

Beomseok Lee; Ioan Calapodescu; Marco Gaido; Matteo Negri; Laurent Besacier

arXiv:2408.03900·cs.CL·August 8, 2024

TL;DR

Speech-MASSIVE is a multilingual speech dataset for Spoken Language Understanding (SLU) and other speech tasks. It covers 12 languages and supports model evaluation in zero-shot, few-shot, and full fine-tuning scenarios.

Contribution

The paper introduces a multilingual speech dataset that carries the intent-prediction and slot-filling annotations of the MASSIVE textual corpus, addressing the scarcity of massively multilingual SLU resources.

Findings

01

SLU baselines reported for both cascaded and end-to-end architectures

02

Baselines cover zero-shot, few-shot, and full fine-tuning scenarios

03

Dataset supports benchmarking speech transcription, language ID, and translation

Abstract

We present Speech-MASSIVE, a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSIVE textual corpus. Speech-MASSIVE covers 12 languages from different families and inherits from MASSIVE the annotations for the intent prediction and slot-filling tasks. Our extension is prompted by the scarcity of massively multilingual SLU datasets and the growing need for versatile speech datasets to assess foundation models (LLMs, speech encoders) across languages and tasks. We provide a multimodal, multitask, multilingual dataset and report SLU baselines using both cascaded and end-to-end architectures in various training scenarios (zero-shot, few-shot, and full fine-tune). Furthermore, we demonstrate the suitability of Speech-MASSIVE for benchmarking other tasks such as speech transcription, language identification, and speech…

Code & Models

Repositories

hlt-mt/speech-massive · PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques