Soundscapes in Spectrograms: Pioneering Multilabel Classification for South Asian Sounds
Sudip Chakrabarty, Pappu Bishwas, Rajdeep Chatterjee, Tathagata Bandyopadhyay, Digonto Biswas, Bibek Howlader

TL;DR
This paper presents a novel spectrogram-based CNN approach for multilabel classification of South Asian environmental sounds, outperforming traditional MFCC methods and validated on multiple datasets.
Contribution
Introduces a spectrogram-based CNN method for complex multilabel sound classification in South Asia, surpassing MFCC-based techniques.
Findings
Higher classification accuracy on the SAS-KIIT dataset
Outperforms MFCC-based methods
Validated on the UrbanSound8K dataset
Abstract
Environmental sound classification is a field of growing importance for urban monitoring and cultural soundscape analysis, especially within the acoustically rich environments of South Asia. These regions present a unique challenge as multiple natural, human, and cultural sounds often overlap, straining traditional methods that frequently rely on Mel Frequency Cepstral Coefficients (MFCC). This study introduces a novel spectrogram-based methodology with a superior ability to capture these complex auditory patterns. A Convolutional Neural Network (CNN) architecture is implemented to solve a demanding multilabel, multiclass classification problem on the SAS-KIIT dataset. To demonstrate robustness and comparability, the approach is also validated using the renowned UrbanSound8K dataset. The results confirm that the proposed spectrogram-based method significantly outperforms existing…
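The two ideas the abstract combines, a spectrogram input and a multilabel output, can be illustrated with a minimal sketch. This is not the authors' pipeline: the frame length, hop size, sampling rate, synthetic test signal, example logits, and the 0.5 threshold are all assumptions chosen for illustration, and the helper names `log_spectrogram` and `decode_multilabel` are hypothetical. The CNN itself is omitted, since the abstract does not describe its layers.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT spectrogram, shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude)  # compress dynamic range

def decode_multilabel(logits, threshold=0.5):
    """Independent sigmoid per class, so several labels can be active at once.

    A softmax would instead force the classes to compete for a single label.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))
    return (probs >= threshold).astype(int)

# Synthetic one-second clip with two overlapping tones (assumed 8 kHz rate).
sr = 8000
t = np.arange(sr) / sr
mix = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)

spec = log_spectrogram(mix)

# Made-up logits standing in for a network's output over three classes.
labels = decode_multilabel(np.array([2.1, -1.3, 0.4]))
```

The spectrogram keeps the full time-frequency grid, which a 2D CNN can treat like an image, whereas MFCCs reduce each frame to a small set of cepstral coefficients. The thresholded sigmoid output is what allows overlapping sources in one clip to each receive their own label.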
Taxonomy
Topics: Music and Audio Processing · Noise Effects and Management · Animal Vocal Communication and Behavior
