Optimal Scalogram for Computational Complexity Reduction in Acoustic Recognition Using Deep Learning

Dang Thoai Phan; Tuan Anh Huynh; Van Tuan Pham; Cao Minh Tran; Van Thuan Mai; Ngoc Quy Tran

arXiv:2505.13017 · eess.AS · December 1, 2025

Open Access

TL;DR

This paper introduces an optimized scalogram method that reduces the computational complexity of the Continuous Wavelet Transform for acoustic recognition, maintaining performance while improving efficiency.

Contribution

It proposes a novel approach to optimize wavelet kernel length and hop size, significantly lowering computational costs of CWT in deep learning-based acoustic recognition.

Findings

01. Reduced computational cost of CWT by optimizing parameters

02. Maintained recognition performance with the new method

03. Demonstrated effectiveness on acoustic recognition tasks

Abstract

The Continuous Wavelet Transform (CWT) is an effective tool for feature extraction in acoustic recognition using Convolutional Neural Networks (CNNs), particularly when applied to non-stationary audio. However, its high computational cost poses a significant challenge, often leading researchers to prefer alternative methods such as the Short-Time Fourier Transform (STFT). To address this issue, this paper proposes a method to reduce the computational complexity of CWT by optimizing the length of the wavelet kernel and the hop size of the output scalogram. Experimental results demonstrate that the proposed approach significantly reduces computational cost while maintaining the robust performance of the trained model in acoustic recognition tasks.
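
The two parameters named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the complex Morlet wavelet, the center frequency parameter w0, the function names, and the test signal are all assumptions chosen for illustration. The idea shown is only that a direct CWT evaluated every `hop` samples with kernels truncated to `kernel_len` samples needs about n_scales × ceil(n / hop) × kernel_len multiply-accumulates, so shortening the kernel or enlarging the hop lowers the cost proportionally.

```python
import numpy as np

def morlet_kernel(scale, length, w0=6.0):
    """Complex Morlet wavelet (assumed choice) sampled at `length` points, dilated by `scale` samples."""
    t = (np.arange(length) - (length - 1) / 2.0) / scale
    return np.exp(1j * w0 * t) * np.exp(-0.5 * t**2) / np.sqrt(scale)

def strided_scalogram(x, scales, kernel_len, hop):
    """Magnitude of the CWT computed only every `hop` samples, with kernels truncated to `kernel_len`."""
    x = np.asarray(x, dtype=float)
    left = kernel_len // 2
    xp = np.pad(x, (left, kernel_len - 1 - left))            # zero-pad so every frame fits
    starts = np.arange(0, len(x), hop)                       # one output column per hop
    frames = xp[starts[:, None] + np.arange(kernel_len)]     # (n_frames, kernel_len)
    kernels = np.stack([morlet_kernel(s, kernel_len) for s in scales])
    return np.abs(kernels.conj() @ frames.T)                 # (n_scales, n_frames)

def direct_macs(n, n_scales, kernel_len, hop):
    """Multiply-accumulate count of the direct evaluation above."""
    return n_scales * int(np.ceil(n / hop)) * kernel_len

# Hypothetical test signal: a sinusoid near the center frequency of scale 8 (w0 / (2*pi*8)).
n = 1024
x = np.sin(2 * np.pi * 0.1194 * np.arange(n))
scales = np.array([4.0, 8.0, 16.0])

S = strided_scalogram(x, scales, kernel_len=65, hop=16)
print(S.shape)  # (3, 64)
print(direct_macs(n, len(scales), 65, 1) // direct_macs(n, len(scales), 65, 16))  # 16
```

In this sketch the strided output is exactly the hop-1 output subsampled along time, so the hop trades time resolution for cost, while the kernel length trades accuracy at large scales (where the wavelet envelope is widest) for cost. Which settings preserve recognition accuracy is the empirical question the paper addresses.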

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis