Low-Complexity Acoustic Scene Classification Using Parallel Attention-Convolution Network

Yanxiong Li; Jiaxin Tan; Guoqing Chen; Jialong Li; Yongjie Si; Qianhua He

arXiv:2406.08119·eess.AS·June 13, 2024·1 cites

TL;DR

This paper introduces a low-complexity acoustic scene classification system built on a parallel attention-convolution network. On the official DCASE2023 dataset it reaches 56.10% accuracy with 5.21K parameters and 1.44 million multiply-accumulate operations (MACs).

Contribution

The paper presents a parallel attention-convolution network with four modules (pre-processing, fusion, and global and local contextual information extraction), combined with knowledge distillation, data augmentation, and adaptive residual normalization.

Findings

01. Achieved 56.10% accuracy on the official DCASE2023 dataset.

02. Parameter count of 5.21K with 1.44 million multiply-accumulate operations (MACs).

03. Exceeded the top two systems of the DCASE2023 challenge in both accuracy and complexity.
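Parameter count and MACs are the two complexity measures reported above. As a rough illustration of how such figures arise, the sketch below counts both for a single 2D convolution layer. The layer shapes are hypothetical and are not taken from the paper's architecture; the helper `conv2d_cost` is an illustrative name, and the bias addition is not counted as a MAC here (conventions vary between tools).

```python
def conv2d_cost(c_in, c_out, kernel, out_h, out_w, groups=1, bias=True):
    """Return (parameters, MACs) for one 2D convolution layer.

    Hypothetical helper for illustration only; not the paper's code.
    """
    k_h, k_w = kernel
    # Each output filter sees c_in / groups input channels.
    weights_per_filter = (c_in // groups) * k_h * k_w
    params = c_out * weights_per_filter + (c_out if bias else 0)
    # One multiply-accumulate per weight per output position.
    macs = c_out * weights_per_filter * out_h * out_w
    return params, macs


# Assumed shapes: 16 -> 16 channels on a 32 x 32 output feature map.
standard = conv2d_cost(16, 16, (3, 3), 32, 32)
depthwise = conv2d_cost(16, 16, (3, 3), 32, 32, groups=16)
pointwise = conv2d_cost(16, 16, (1, 1), 32, 32)

for name, (params, macs) in [("standard", standard),
                             ("depthwise", depthwise),
                             ("pointwise", pointwise)]:
    print(f"{name:10s} params={params:6d} MACs={macs:9d}")
```

Under these assumed shapes, a depthwise convolution followed by a pointwise one needs far fewer parameters and MACs than a single standard 3 x 3 convolution, which is one common way low-complexity models keep their budgets small.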

Abstract

This work is an improved version of the system that we submitted to Task 1 of the DCASE2023 challenge. We propose a method for low-complexity acoustic scene classification based on a parallel attention-convolution network, which consists of four modules: pre-processing, fusion, global contextual information extraction, and local contextual information extraction. The proposed network captures global and local contextual information from each audio clip in a computationally efficient way. In addition, we integrate other techniques into our method, such as knowledge distillation, data augmentation, and adaptive residual normalization. When evaluated on the official dataset of the DCASE2023 challenge, our method obtains the highest accuracy of 56.10% with 5.21K parameters and 1.44 million multiply-accumulate operations (MACs). It exceeds the top two systems of the DCASE2023 challenge in accuracy and complexity, and obtains state-of-the-art…
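The abstract names knowledge distillation as one of the integrated techniques but does not give the loss used. The sketch below shows a generic, widely used formulation (hard-label cross-entropy mixed with a temperature-softened KL term), not necessarily the authors' version. The function names, the temperature of 2.0, and the weight `alpha` of 0.5 are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss; hyperparameters are assumed values."""
    # Hard-label term: cross-entropy of the student at temperature 1.
    ce = -math.log(softmax(student_logits)[label])
    # Soft-label term: KL(teacher || student) on softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    # The T^2 factor keeps the soft-term gradient scale comparable.
    return (1 - alpha) * ce + alpha * temperature ** 2 * kl


# Made-up logits for a 3-class toy problem.
teacher = [2.0, 0.5, -1.0]
matched = distillation_loss(teacher, teacher, label=0)
mismatched = distillation_loss([-1.0, 0.5, 2.0], teacher, label=0)
print(matched, mismatched)
```

When the student reproduces the teacher's logits, the KL term vanishes and only the weighted cross-entropy remains; a student that disagrees with the teacher is penalized by both terms.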

Code & Models

Repositories

jessytan/low-complexity-asc
PyTorch · Official

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis