Improving the efficiency of spectral features extraction by structuring the audio files
Dishant Parikh, Saurabh Sachdev

TL;DR
This paper proposes a dataset formatting method that significantly reduces the computational cost of spectral feature extraction from music clips by enabling accurate analysis with only 10% of the clip processed.
Contribution
It introduces a novel dataset structuring approach that decreases spectral feature extraction time without sacrificing accuracy, and suggests generic durations for specific music types.
Findings
Processing a duration equal to only 10% of the global average clip length maintains feature accuracy.
Structured datasets improve spectral feature extraction efficiency.
The method is applicable across various music genres.
Abstract
Extracting spectral features from a music clip is computationally expensive, because accurate features normally require processing the clip for its whole length. This preprocessing creates a large overhead and slows the extraction process. We show how formatting a dataset in a certain way can make the process more efficient by eliminating the need to process the clip for its whole duration while still extracting the features accurately. In addition, we discuss the possibility of defining set generic durations for analyzing a certain type of music clip during training. In doing so, we cut the clip duration that must be processed to just 10% of the global average.
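To make the idea of partial-clip processing concrete, the following is a minimal sketch, not the authors' method. It compares a spectral centroid averaged over a whole clip with the same feature averaged over only the leading 10% of it. Everything in it is an assumption made for illustration: the helper names `spectral_centroid` and `mean_centroid`, the choice of spectral centroid as the feature, the frame size, and the synthetic two-tone signal. It uses only the Python standard library with a naive DFT; a real pipeline would more likely use an FFT-based audio library.

```python
import cmath
import math


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one frame, via a naive DFT."""
    n = len(frame)
    weighted, total = 0.0, 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        mag = abs(coeff)
        weighted += (k * sample_rate / n) * mag
        total += mag
    return weighted / total if total else 0.0


def mean_centroid(signal, sample_rate, fraction=1.0, frame_size=128):
    """Average the per-frame centroid over the leading `fraction` of the clip."""
    usable = int(len(signal) * fraction)
    frames = [signal[i:i + frame_size]
              for i in range(0, usable - frame_size + 1, frame_size)]
    return sum(spectral_centroid(f, sample_rate) for f in frames) / len(frames)


# Hypothetical stand-in for a music clip: two steady sine tones,
# 1 second long at a 4 kHz sample rate.
SAMPLE_RATE = 4000
clip = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
        for t in range(SAMPLE_RATE)]

full = mean_centroid(clip, SAMPLE_RATE, fraction=1.0)     # every frame
partial = mean_centroid(clip, SAMPLE_RATE, fraction=0.1)  # leading 10% only

print(f"full clip: {full:.2f} Hz, first 10%: {partial:.2f} Hz")
```

Because this synthetic signal is stationary, the two estimates agree while the partial run processes roughly a tenth of the frames. Real music is not stationary, so whether a short excerpt is representative depends on how the clips are prepared, which is the question the dataset formatting in the paper addresses.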
