Low-complexity acoustic scene classification in DCASE 2022 Challenge

Irene Martín-Morató; Francesco Paissan; Alberto Ancilotto; Toni Heittola; Annamaria Mesaros; Elisabetta Farella; Alessio Brutti; Tuomas Virtanen

arXiv:2206.03835 · eess.AS · July 14, 2022 · ASE · 22 cites

Open Access

TL;DR

This paper analyzes low-complexity acoustic scene classification in the DCASE 2022 Challenge, focusing on model size, computational constraints, and performance of various submissions.

Contribution

It provides an overview of the challenge's low-complexity constraints, baseline system design, and comparative analysis of submitted models.

Findings

01

Most submissions outperformed the baseline system.

02

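The top system reached 59.6% accuracy and 1.091 log loss (baseline: 44.2% and 1.532) while using 121 K parameters, more than the baseline's 46.5 K but within the 128 K limit.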

03

The challenge demonstrated effective low-complexity acoustic scene classification methods.

Abstract

This paper presents an analysis of the Low-Complexity Acoustic Scene Classification task in the DCASE 2022 Challenge. The task was a continuation from previous years, but the low-complexity requirements were changed to the following: the maximum number of allowed parameters, including the zero-valued ones, was 128 K, with parameters being represented using INT8 numerical format; and the maximum number of multiply-accumulate operations at inference time was 30 million. The provided baseline system is a convolutional neural network which employs post-training quantization of parameters, resulting in 46.5 K parameters and 29.23 million multiply-accumulate operations (MMACs). Its performance on the evaluation data is 44.2% accuracy and 1.532 log loss. In comparison, the top system in the challenge obtained an accuracy of 59.6% and a log loss of 1.091, having 121 K parameters and 28…
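The two complexity limits in the abstract can be checked with simple per-layer arithmetic. The sketch below is illustrative only: it reads "128 K" as 128,000 parameters (an assumption), counts one MAC per weight per output position, and uses hypothetical layer shapes that are not the baseline architecture. The function names are invented for this example.

```python
# Limits quoted in the abstract. Reading "128 K" as 128,000 is an assumption.
MAX_PARAMS = 128_000       # includes zero-valued parameters
MAX_MACS = 30_000_000      # multiply-accumulate operations at inference time
BYTES_PER_PARAM = 1        # INT8 representation


def conv2d_cost(in_ch, out_ch, kernel, out_h, out_w, bias=True):
    """Return (parameters, MACs) for one standard 2-D convolution layer."""
    kh, kw = kernel
    weights = in_ch * out_ch * kh * kw
    params = weights + (out_ch if bias else 0)
    macs = weights * out_h * out_w  # each weight is used once per output position
    return params, macs


def dense_cost(in_features, out_features, bias=True):
    """Return (parameters, MACs) for one fully connected layer."""
    weights = in_features * out_features
    params = weights + (out_features if bias else 0)
    return params, weights


def within_limits(layer_costs):
    """Sum per-layer costs and compare the totals against both limits."""
    total_params = sum(p for p, _ in layer_costs)
    total_macs = sum(m for _, m in layer_costs)
    ok = total_params <= MAX_PARAMS and total_macs <= MAX_MACS
    return ok, total_params, total_macs


# Hypothetical three-layer model, chosen only to exercise the functions.
layers = [
    conv2d_cost(1, 16, (7, 7), out_h=40, out_w=51),
    conv2d_cost(16, 32, (3, 3), out_h=20, out_w=25),
    dense_cost(100, 10),
]
ok, n_params, n_macs = within_limits(layers)
print(ok, n_params, n_macs, n_params * BYTES_PER_PARAM, "bytes")
```

Under this reading, the reported baseline figures (46.5 K parameters, 29.23 MMACs) sit inside both limits, with the MAC budget being the tighter of the two.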

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Speech and Audio Processing