Low-complexity acoustic scene classification for multi-device audio: analysis of DCASE 2021 Challenge systems

Irene Martín-Morató; Toni Heittola; Annamaria Mesaros; Tuomas Virtanen

arXiv:2105.13734 · eess.AS · July 21, 2021 · ASE · 32 cites

Open Access · 1 Repo

TL;DR

This paper analyzes low-complexity acoustic scene classification systems from the DCASE 2021 Challenge, highlighting techniques like residual networks and quantization that improved accuracy over baselines in multi-device audio scenarios.

Contribution

It provides a detailed analysis of various submitted systems, emphasizing the effectiveness of residual networks and quantization in low-complexity acoustic scene classification.

Findings

01

Most submissions outperformed the baseline system (47.7% accuracy, log loss 1.473).

02

Top systems reached over 70% accuracy with log loss under 0.8.

03

Residual networks and weight quantization were the most used techniques among the submissions.

Abstract

This paper presents the details of Task 1A Acoustic Scene Classification in the DCASE 2021 Challenge. The task targeted development of low-complexity solutions with good generalization properties. The provided baseline system is based on a CNN architecture and post-training quantization of parameters. The system is trained using all the available training data, without any specific technique for handling device mismatch, and obtains an overall accuracy of 47.7%, with a log loss of 1.473. The task received 99 submissions from 30 teams, and most of the submitted systems outperformed the baseline. The most used techniques among the submissions were residual networks and weight quantization, with the top systems reaching over 70% accuracy, and log loss under 0.8. The acoustic scene classification task remained a popular task in the challenge, despite the increasing difficulty of the setup.
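The abstract reports both accuracy and log loss. As a rough illustration of how these two metrics relate, the following is a minimal sketch, assuming log loss means the mean negative natural-log probability assigned to the true class and accuracy is a plain per-clip average. The challenge's exact averaging scheme may differ. The toy probabilities, the three-class example, and the 10-class chance reference are illustrative assumptions, not data from the paper.

```python
import numpy as np


def accuracy(probs, labels):
    """Fraction of clips whose highest-probability class matches the label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))


def log_loss(probs, labels, eps=1e-15):
    """Mean negative natural-log probability assigned to the true class."""
    clipped = np.clip(probs, eps, 1.0)  # avoid log(0)
    true_class_probs = clipped[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class_probs)))


# Toy example (hypothetical values): 4 clips, 3 scene classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],  # misclassified: true class is 1
])
labels = np.array([0, 1, 2, 1])

print("accuracy:", accuracy(probs, labels))
print("log loss:", log_loss(probs, labels))

# Chance-level reference, assuming 10 scene classes (an assumption here):
# a uniform prediction yields a log loss of ln(10).
n_classes = 10
uniform = np.full((1, n_classes), 1.0 / n_classes)
print("uniform log loss:", log_loss(uniform, np.array([0])))
```

Under these assumptions, log loss penalizes confident wrong predictions more heavily than accuracy does, which is why a system can improve on one metric without a matching gain on the other.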

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

mathworks/Baseline-MATLAB-DCASE

Videos

No videos yet.

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies