DDFAD: Dataset Distillation Framework for Audio Data

Wenbo Jiang; Rui Zhang; Hongwei Li; Xiaoyuan Liu; Haomiao Yang; Shui Yu

arXiv:2407.10446·cs.SD·July 23, 2024

Open Access

TL;DR

This paper introduces DDFAD, a novel framework for dataset distillation tailored to audio data, enabling significant data compression while maintaining model performance, with potential applications in continual learning and neural architecture search.

Contribution

The paper pioneers the application of dataset distillation to audio data, proposing new feature extraction, distillation, and reconstruction methods specific to audio signals.

Findings

01 DDFAD effectively compresses audio datasets with minimal performance loss.

02 The proposed FD-MFCC features improve distillation quality for audio data.

03 DDFAD shows promising results in various audio datasets and applications.

Abstract

Deep neural networks (DNNs) have achieved significant success in numerous applications. The remarkable performance of DNNs is largely attributed to the availability of massive, high-quality training datasets. However, processing such massive training data requires huge computational and storage resources. Dataset distillation is a promising solution to this problem, offering the capability to compress a large dataset into a smaller distilled dataset. The model trained on the distilled dataset can achieve comparable performance to the model trained on the whole dataset. While dataset distillation has been demonstrated in image data, none have explored dataset distillation for audio data. In this work, for the first time, we propose a Dataset Distillation Framework for Audio Data (DDFAD). Specifically, we first propose the Fused Differential MFCC (FD-MFCC) as extracted features for…
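The abstract is cut off before it defines FD-MFCC, so the following is only an illustrative sketch of one common way to build "differential" cepstral features: stacking a coefficient matrix with its first- and second-order temporal differences. Whether DDFAD fuses its features this way is an assumption, not something the text above confirms. The function names `delta` and `fuse_with_deltas` are hypothetical, and a random array stands in for a real MFCC matrix.

```python
import numpy as np


def delta(features: np.ndarray) -> np.ndarray:
    """Temporal difference along the frame axis.

    features has shape (n_coeffs, n_frames). np.gradient uses central
    differences in the interior and one-sided differences at the edges.
    """
    return np.gradient(features, axis=1)


def fuse_with_deltas(features: np.ndarray) -> np.ndarray:
    """Stack features with first- and second-order differences.

    ASSUMPTION: this stacking is a generic delta/delta-delta construction,
    not necessarily the FD-MFCC defined in the paper.
    """
    d1 = delta(features)   # first-order difference
    d2 = delta(d1)         # second-order difference
    return np.concatenate([features, d1, d2], axis=0)


# Stand-in for a 13-coefficient MFCC matrix over 100 frames (synthetic data).
rng = np.random.default_rng(0)
mfcc_like = rng.standard_normal((13, 100))
fused = fuse_with_deltas(mfcc_like)
print(fused.shape)  # (39, 100)
```

In practice the input would come from an MFCC extractor rather than random numbers. The stacking step is the same either way, and it triples the number of feature rows while keeping the frame count unchanged.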

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Diverse Musicological Studies

Methods: Griffin-Lim Algorithm