Cluster-based pruning techniques for audio data
Boris Bergsma, Marta Brzezinska, Oleg V. Yazyev, Milos Cernak

TL;DR
This paper applies k-means clustering to data pruning for audio datasets and shows, on a keyword spotting dataset, that the data can be reduced significantly while neural network classification performance is maintained.
Contribution
The authors present this as the first application of k-means clustering to data pruning in the audio domain, and show that it reduces dataset size while maintaining classification performance.
Findings
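K-means clustering can significantly reduce the size of audio datasets, as illustrated on a keyword spotting (KWS) dataset.
Classification performance is maintained after pruning across neural networks with different architectures.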
Scaling analysis helps identify optimal pruning strategies.
Abstract
Deep learning models have become widely adopted in various domains, but their performance relies heavily on large amounts of data. Datasets often contain many irrelevant or redundant samples, which can lead to computational inefficiencies during training. In this work, we introduce, for the first time in the audio domain, k-means clustering as a method for efficient data pruning. K-means clustering groups similar samples together, which allows the dataset to be reduced in size while preserving its representative characteristics. As an example, we perform a clustering analysis on a keyword spotting (KWS) dataset. We discuss how k-means clustering can significantly reduce the size of audio datasets while maintaining the classification performance across neural networks (NNs) with different architectures. We further comment on the…
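The idea of grouping similar samples and keeping only a representative subset can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kmeans` and `prune_by_cluster`, the `keep_fraction` parameter, and the rule of keeping the members closest to each centroid are all assumptions made for illustration, since the abstract does not state how samples are selected within a cluster. The 2-D points stand in for per-utterance feature vectors. The code uses only the Python standard library to stay dependency-free; a library k-means would normally be used in practice.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct samples of the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return centroids, labels


def prune_by_cluster(points, k, keep_fraction, seed=0):
    """Return the sorted indices of the samples kept after pruning.

    Assumed selection rule (not taken from the paper): within each
    cluster, keep the keep_fraction of members closest to the centroid,
    and always keep at least one member per non-empty cluster.
    """
    centroids, labels = kmeans(points, k, seed=seed)
    kept = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        idx.sort(key=lambda i: math.dist(points[i], centroids[c]))
        n_keep = max(1, math.ceil(keep_fraction * len(idx)))
        kept.extend(idx[:n_keep])
    return sorted(kept)


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy stand-in for audio feature vectors: two synthetic 2-D blobs.
    blob_a = [[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(10)]
    blob_b = [[rng.gauss(5, 0.1), rng.gauss(5, 0.1)] for _ in range(10)]
    features = blob_a + blob_b
    kept = prune_by_cluster(features, k=2, keep_fraction=0.5)
    print(len(kept), "of", len(features), "samples kept")
```

Keeping the members nearest each centroid favours prototypical samples. Keeping the farthest members instead would favour harder or more unusual samples. Which choice preserves accuracy better is an empirical question that this sketch does not settle.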
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis
Methods: Pruning · k-Means Clustering
