Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech

Chien-yu Huang; Ke-Han Lu; Shih-Heng Wang; Chi-Yuan Hsiao; Chun-Yi Kuan; Haibin Wu; Siddhant Arora; Kai-Wei Chang; Jiatong Shi; Yifan Peng; Roshan Sharma; Shinji Watanabe; Bhiksha Ramakrishnan; Shady Shehata; Hung-yi Lee

arXiv:2309.09510 · eess.AS · March 25, 2024

TL;DR

Dynamic-SUPERB is a new benchmark for speech instruction tuning that covers diverse tasks and datasets, aiming to facilitate the development of universal speech models capable of zero-shot generalization.

Contribution

The paper introduces Dynamic-SUPERB, a comprehensive, community-driven benchmark for evaluating speech models on multiple tasks using instruction tuning. It also provides baseline approaches and an extensive evaluation of them.

Findings

01

Baselines perform well on seen tasks but poorly on unseen tasks.

02

The benchmark covers 55 evaluation instances across 33 tasks and 22 datasets.

03

Materials are publicly released to encourage collaborative advancement.
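
The relationship between instances, tasks, and datasets in the second finding can be illustrated with a minimal sketch. It assumes only what the abstract states, namely that an evaluation instance combines a task with a dataset. The class name, helper functions, and the placeholder task and dataset names below are hypothetical and are not taken from the benchmark's repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationInstance:
    """One (task, dataset) combination. Hypothetical structure for illustration."""
    task: str
    dataset: str


def build_instances(pairs):
    """Deduplicate (task, dataset) pairs into a sorted list of instances."""
    unique = {EvaluationInstance(task, dataset) for task, dataset in pairs}
    return sorted(unique, key=lambda inst: (inst.task, inst.dataset))


def summarize(instances):
    """Count instances, distinct tasks, and distinct datasets."""
    return {
        "instances": len(instances),
        "tasks": len({inst.task for inst in instances}),
        "datasets": len({inst.dataset for inst in instances}),
    }


# Placeholder names, not real benchmark entries.
pairs = [
    ("task_a", "dataset_x"),
    ("task_a", "dataset_y"),
    ("task_b", "dataset_x"),
]
instances = build_instances(pairs)
summary = summarize(instances)
print(summary)
```

Because one task can be paired with several datasets and one dataset can serve several tasks, the instance count can exceed both the task count and the dataset count while staying far below their product. This is consistent with 55 instances arising from 33 tasks and 22 datasets.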

Abstract

Text language models have shown remarkable zero-shot capability in generalizing to unseen tasks when provided with well-formulated instructions. However, existing studies in speech processing primarily focus on limited or specific tasks. Moreover, the lack of standardized benchmarks hinders a fair comparison across different approaches. Thus, we present Dynamic-SUPERB, a benchmark designed for building universal speech models capable of leveraging instruction tuning to perform multiple tasks in a zero-shot fashion. To achieve comprehensive coverage of diverse speech tasks and harness instruction tuning, we invite the community to collaborate and contribute, facilitating the dynamic growth of the benchmark. To initiate, Dynamic-SUPERB features 55 evaluation instances by combining 33 tasks and 22 datasets. This spans a broad spectrum of dimensions, providing a comprehensive platform for…

Code & Models

Repositories

dynamic-superb/dynamic-superb
Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Focus