Acoustic Scene Classification Using Multichannel Observation with Partially Missing Channels
Keisuke Imoto

TL;DR
This paper examines how partially missing multichannel audio data impacts acoustic scene classification and proposes data augmentation techniques to mitigate performance degradation.
Contribution
It provides a detailed analysis of missing channel effects and introduces simple data augmentation methods to improve classification robustness.
Findings
Missing channels significantly reduce classification accuracy.
Data augmentation improves performance with incomplete data.
Analysis guides better handling of unreliable multichannel audio.
Abstract
Sounds recorded with smartphones or IoT devices often contain partially unreliable observations caused by clipping or wind noise, as well as completely missing parts due to microphone failure or packet loss in data transmission over the network. In this paper, we investigate the impact of partially missing channels on the performance of acoustic scene classification using multichannel audio recordings, especially for a distributed microphone array. Missing observations cause not only losses of time-frequency and spatial information on sound sources but also a mismatch between a trained model and evaluation data. We thus investigate in detail how a missing channel affects the performance of acoustic scene classification. We also propose simple data augmentation methods for scene classification using multichannel observations with partially missing channels and evaluate the scene classification…
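The train/evaluation mismatch described in the abstract can be illustrated with a small sketch. The abstract does not specify the paper's augmentation methods, so this is not the author's implementation. It is one common way to simulate missing channels, assuming a missing channel is represented by zero-filled samples. The names `drop_channels` and `augment_batch`, and the nested-list clip layout `clip[channel][sample]`, are hypothetical and chosen only for illustration.

```python
import random
from typing import List, Optional

# Assumed layout: clip[channel][sample], one inner list per microphone.
Clip = List[List[float]]


def drop_channels(clip: Clip, num_missing: int,
                  rng: Optional[random.Random] = None) -> Clip:
    """Return a copy of `clip` with `num_missing` random channels zero-filled.

    Zero-filling is an assumption standing in for a failed microphone or
    lost packets; the input clip is left unmodified.
    """
    if rng is None:
        rng = random.Random()
    if not 0 <= num_missing < len(clip):
        raise ValueError("num_missing must leave at least one channel intact")
    # Pick which channel indices to treat as missing.
    missing = set(rng.sample(range(len(clip)), num_missing))
    return [
        [0.0] * len(channel) if idx in missing else list(channel)
        for idx, channel in enumerate(clip)
    ]


def augment_batch(batch: List[Clip], max_missing: int,
                  rng: Optional[random.Random] = None) -> List[Clip]:
    """Apply `drop_channels` to each clip with 0..max_missing missing channels."""
    if rng is None:
        rng = random.Random()
    return [
        drop_channels(clip, rng.randint(0, max_missing), rng)
        for clip in batch
    ]
```

Applying `augment_batch` to training clips exposes a classifier to the same kind of gaps it may meet at evaluation time, which is the mismatch the abstract points to. Whether zero-filling, noise substitution, or another scheme matches the paper's methods is not stated in the text above.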
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
