Studying the Effect of Audio Filters in Pre-Trained Models for Environmental Sound Classification
Aditya Dawn and Wazib Ansar

TL;DR
This paper introduces a two-level classification approach for environmental sound recognition and studies the effect of various audio filters, including a novel Audio Crop method. On the ESC-50 dataset it reports a maximum accuracy of 78.75% at Level 1 and 98.04% at Level 2.
Contribution
It proposes a new two-level classification methodology and introduces a novel Audio Crop filter that improves sound classification accuracy.
Findings
Maximum accuracy of 78.75% in Level 1 classification
Maximum accuracy of 98.04% in Level 2 classification
Audio Crop filter outperforms other filters in most cases
Abstract
Environmental Sound Classification is an important sound recognition problem and is more complicated than speech recognition because environmental sounds are not well structured with respect to time and frequency. Over the past years, researchers have used various CNN models to learn from different audio features generated from the audio files, such as log mel spectrograms, gammatone spectral coefficients, and mel-frequency spectral coefficients. In this paper, we propose a new methodology, Two-Level Classification: the Level 1 Classifier is responsible for classifying the audio signal into a broader class, and the Level 2 Classifiers are responsible for finding the actual class to which the audio belongs, based on the output of the Level 1 Classifier. We have also shown the effects of different audio filters, among which a new method of Audio Crop is introduced in this…
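The routing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pre-trained models are replaced by toy threshold rules, and the function name two_level_predict, the group names, the class labels, and the two-number feature vector are all hypothetical, chosen only to show how the Level 1 output selects which Level 2 classifier runs.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]
Classifier = Callable[[Features], str]


def two_level_predict(
    features: Features,
    level1: Classifier,
    level2_by_group: Dict[str, Classifier],
) -> Tuple[str, str]:
    """Route a sample through Level 1, then the matching Level 2 classifier."""
    broad = level1(features)  # broader class chosen by Level 1
    if broad not in level2_by_group:
        raise KeyError(f"no Level 2 classifier registered for group {broad!r}")
    fine = level2_by_group[broad](features)  # actual class within that group
    return broad, fine


# Toy stand-ins for trained models (hypothetical rules, not from the paper).
def toy_level1(features: Features) -> str:
    return "animals" if features[0] < 0.5 else "urban"


def toy_animals(features: Features) -> str:
    return "dog" if features[1] < 0.5 else "rooster"


def toy_urban(features: Features) -> str:
    return "siren" if features[1] < 0.5 else "car_horn"


LEVEL2 = {"animals": toy_animals, "urban": toy_urban}

if __name__ == "__main__":
    print(two_level_predict([0.2, 0.9], toy_level1, LEVEL2))
```

With hard routing like this, a sample sent to the wrong group at Level 1 cannot be recovered at Level 2, so the Level 1 and Level 2 accuracies describe different stages and are not directly comparable as end-to-end figures.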
Taxonomy
Topics: Music and Audio Processing · Water Systems and Optimization
