Comparison Performance of Spectrogram and Scalogram as Input of Acoustic Recognition Task
Dang Thoai Phan

TL;DR
This study compares spectrograms and scalograms as input features for acoustic recognition using CNNs, highlighting their respective strengths, limitations, and suitable application scenarios.
Contribution
The paper provides a side-by-side performance comparison of spectrogram and scalogram inputs for CNN-based acoustic recognition, a comparison that prior research has largely lacked.
Findings
Spectrograms perform better in noisy environments.
Scalograms capture more detailed time-frequency information.
Each method has specific advantages depending on the application scenario.
Abstract
Acoustic recognition has emerged as a prominent task in deep learning research, frequently utilizing spectral feature extraction techniques such as the spectrogram from the Short-Time Fourier Transform and the scalogram from the Wavelet Transform. However, there is a notable deficiency in studies that comprehensively discuss the advantages, drawbacks, and performance comparisons of these methods. This paper aims to evaluate the characteristics of these two transforms as input data for acoustic recognition using Convolutional Neural Networks. The performance of the trained models employing both transforms is documented for comparison. Through this analysis, the paper elucidates the advantages and limitations of each method, provides insights into their respective application scenarios, and identifies potential directions for further research.
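To make the two input representations concrete, the sketch below computes a spectrogram (squared magnitude of the Short-Time Fourier Transform) and a scalogram (squared magnitude of a continuous wavelet transform) with NumPy only. It is an illustration, not the paper's pipeline: the function names, the Hann window, the frame and hop sizes, the complex Morlet wavelet with `w0 = 6`, and the 440 Hz test tone are all assumptions chosen for the example, and wavelet amplitude normalization is omitted for brevity.

```python
import numpy as np


def spectrogram(x, fs, n_fft=256, hop=64):
    """Squared-magnitude STFT with a Hann window (illustrative parameters)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    return freqs, power.T  # shape: (n_fft // 2 + 1, n_frames)


def scalogram(x, fs, freqs, w0=6.0):
    """Squared-magnitude CWT with a complex Morlet wavelet, evaluated via FFT.

    `freqs` lists the centre frequencies (Hz) to analyse; each one is mapped
    to a wavelet scale s = w0 / (2 * pi * f). Normalization is omitted.
    """
    n = len(x)
    X = np.fft.fft(x)
    omega = 2 * np.pi * np.fft.fftfreq(n, d=1.0 / fs)  # rad/s
    out = np.empty((len(freqs), n))
    for k, f in enumerate(freqs):
        s = w0 / (2 * np.pi * f)
        # Fourier transform of the scaled Morlet wavelet (positive frequencies only).
        psi_hat = np.exp(-0.5 * (s * omega - w0) ** 2) * (omega > 0)
        out[k] = np.abs(np.fft.ifft(X * psi_hat)) ** 2
    return out  # shape: (len(freqs), len(x))


# Hypothetical test signal: one second of a 440 Hz tone sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)

stft_freqs, S = spectrogram(x, fs)
cwt_freqs = np.array([110.0, 220.0, 440.0, 880.0, 1760.0])
W = scalogram(x, fs, cwt_freqs)
```

The spectrogram uses one window length for every frequency bin, whereas each scalogram row uses a wavelet whose duration shrinks as its centre frequency rises. Either array can be log-scaled and resized into an image before being passed to a CNN.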
Taxonomy
Topics: Engineering Applied Research · Internet of Things and Social Network Interactions
