DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models

Tzu-Quan Lin; Hung-yi Lee; Hao Tang

arXiv:2406.05464·cs.SD·September 2, 2024

TL;DR

DAISY introduces a data-adaptive early exit method for self-supervised speech models. It reduces inference time and computational cost by adjusting the exit point to the noise level of each input, without requiring a separate early exit model or fine-tuning for each task.

Contribution

It proposes an early exit strategy that decides when to exit based on the self-supervised loss, avoiding per-task training or fine-tuning, and matches HuBERT's performance on MiniSUPERB with faster inference.

Findings

01

DAISY matches HuBERT's performance on MiniSUPERB.

02

It exits early on clean data and later on noisy data, adapting to input noise levels.

03

DAISY significantly reduces inference time without sacrificing accuracy.

Abstract

Self-supervised speech models have been shown to be useful for various tasks, but their large size limits their use on devices with low computing power and memory. In this work, we explore early exit, an approach for reducing latency by exiting the forward process of a network early. Most early exit approaches need a separate early exit model for each task, with some even requiring fine-tuning of the entire pretrained model. We introduce Data Adaptive Self-Supervised Early Exit (DAISY), an approach that decides when to exit based on the self-supervised loss, eliminating the need for multiple rounds of training and fine-tuning. DAISY matches the performance of HuBERT on the MiniSUPERB benchmark, but with much faster inference. Our analysis of the adaptivity of DAISY shows that the model exits early (using fewer layers) on clean data and exits late (using more layers) on noisy data,…
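
The abstract does not spell out the exit rule, so the following is only a minimal sketch of the general idea under stated assumptions: each layer's output is scored with a confidence proxy (here, the entropy of a softmax over the hidden vector, standing in for a per-layer self-supervised loss), and the forward pass stops at the first layer whose score drops below a threshold. The names `early_exit_forward`, `sharpen`, and `threshold`, and the toy layers themselves, are hypothetical and are not taken from the paper or its official repository.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def early_exit_forward(
    x: Sequence[float],
    layers: Sequence[Callable[[Vector], Vector]],
    threshold: float,
) -> Tuple[Vector, int]:
    """Apply layers in order and stop at the first one whose output entropy
    falls below `threshold`. Returns the output and the number of layers used.

    ASSUMPTION: entropy is a stand-in for the per-layer self-supervised loss;
    the actual criterion used by DAISY may differ.
    """
    h = list(x)
    for depth, layer in enumerate(layers, start=1):
        h = layer(h)
        if entropy(softmax(h)) < threshold:
            return h, depth  # confident enough: exit early
    return h, len(layers)  # never confident: use the full stack


def sharpen(h: Vector) -> Vector:
    """Toy layer (hypothetical): doubling the logits sharpens the softmax."""
    return [2.0 * v for v in h]


toy_layers = [sharpen] * 6

# A "clean" input starts with a clear peak; a "noisy" one is nearly flat.
_, clean_exit = early_exit_forward([2.0, 0.0, 0.0], toy_layers, threshold=0.5)
_, noisy_exit = early_exit_forward([0.2, 0.0, 0.1], toy_layers, threshold=0.5)
print("layers used (clean, noisy):", clean_exit, noisy_exit)
```

In this toy setup the peaked input crosses the threshold after fewer layers than the nearly flat one, which mirrors the adaptive behavior the abstract reports: fewer layers on clean data, more on noisy data. The threshold trades speed for confidence, and its value here is arbitrary.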

Code & Models

Repositories

nervjack2/DAISY-Data-Adaptive-Self-Supervised-Early-Exit (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing