Robust, General, and Low Complexity Acoustic Scene Classification Systems and An Effective Visualization for Presenting a Sound Scene Context
Lam Pham, Dusan Salovic, Anahid Jalali, Alexander Schindler, Khoa Tran, Canh Vu, Phu X. Nguyen

TL;DR
This paper introduces a robust, low-complexity acoustic scene classification system with a novel neural network architecture and a visualization method, validated across multiple datasets for real-world applicability.
Contribution
It proposes a new residual-inception neural network architecture for ASC and an effective visualization technique for sound scene context.
Findings
The proposed ASC system achieves high accuracy across various datasets.
The residual-inception model balances complexity and performance effectively.
Sound event information improves scene classification accuracy.
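
The complexity side of that trade-off can be made concrete with a parameter count. The sketch below is illustrative only and is not the paper's NRI architecture: the block layout, channel sizes, and function names are assumptions. It compares a multi-kernel (inception-style) block built from standard convolutions with the same block built from depthwise separable convolutions, using the standard formulas for each. A residual (skip) connection adds no parameters when input and output shapes match, so it is left out of the count.

```python
def conv2d_params(c_in, c_out, kernel, bias=True):
    """Trainable parameters of a standard 2-D convolution with a square kernel."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def separable_conv2d_params(c_in, c_out, kernel, bias=True):
    """Parameters of a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


def multi_kernel_block_params(c_in, c_branch, kernels=(1, 3, 5), separable=False):
    """Total parameters of parallel branches, one per kernel size (hypothetical layout)."""
    count = separable_conv2d_params if separable else conv2d_params
    return sum(count(c_in, c_branch, k) for k in kernels)


# Assumed sizes, chosen only for illustration: 64 input channels, 32 per branch.
standard = multi_kernel_block_params(64, 32)
separable = multi_kernel_block_params(64, 32, separable=True)
print(f"standard branches:  {standard} parameters")
print(f"separable branches: {separable} parameters")
print(f"reduction factor:   {standard / separable:.1f}x")
```

Under these assumed sizes the separable variant needs several times fewer parameters, which is the kind of saving a low-complexity model relies on; whether accuracy holds up is an empirical question the paper evaluates.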
Abstract
In this paper, we present a comprehensive analysis of Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. In particular, we first propose an inception-based, low-footprint ASC model, referred to as the ASC baseline. The proposed ASC baseline is then compared with benchmark, high-complexity network architectures: MobileNetV1, MobileNetV2, VGG16, VGG19, ResNet50V2, ResNet152V2, DenseNet121, DenseNet201, and Xception. Next, we improve the ASC baseline by proposing a novel deep neural network architecture that leverages residual-inception architectures and multiple kernels. Given the novel residual-inception (NRI) model, we further evaluate the trade-off between model complexity and model accuracy. Finally, we evaluate whether sound events occurring in a sound scene recording can help to…
Taxonomy
Topics: Music and Audio Processing · Diverse Musicological Studies · Speech and Audio Processing
Methods: Depthwise Convolution · Pointwise Convolution · Global Average Pooling · Depthwise Separable Convolution · Residual Connection · Convolution · Softmax · 1x1 Convolution · Batch Normalization
