E-PANNs: Sound Recognition Using Efficient Pre-trained Audio Neural Networks

Arshdeep Singh; Haohe Liu; Mark D. Plumbley

arXiv:2305.18665 · cs.SD · May 31, 2023 · 2 cites

TL;DR

This paper introduces E-PANNs, a pruned version of pre-trained audio neural networks (PANNs) that cuts computation by 36% and memory by 70% while slightly improving sound recognition accuracy.

Contribution

The paper presents a pruning-based method that reduces the computational complexity and storage size of PANNs, making the model easier to deploy on resource-constrained devices while slightly improving recognition performance.
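To make the idea of filter pruning concrete, here is a minimal sketch in plain Python. It is not the paper's method: this summary does not state which pruning criterion E-PANNs uses, so the sketch assumes a generic magnitude-based rule (rank each convolutional filter by the L1 norm of its weights and drop the weakest). The function names `l1_norm`, `select_filters_to_keep`, and `prune_layer`, and the `prune_ratio` parameter, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def l1_norm(filter_weights: Sequence[float]) -> float:
    """Sum of absolute weights; a simple proxy for filter importance."""
    return sum(abs(w) for w in filter_weights)


def select_filters_to_keep(
    filters: Sequence[Sequence[float]], prune_ratio: float
) -> List[int]:
    """Return the indices of the filters that survive pruning, in original order.

    Assumption: importance is the L1 norm of the flattened filter weights.
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    # Always keep at least one filter so the layer still produces output.
    n_keep = max(1, round(len(filters) * (1.0 - prune_ratio)))
    ranked = sorted(
        range(len(filters)), key=lambda i: l1_norm(filters[i]), reverse=True
    )
    return sorted(ranked[:n_keep])


def prune_layer(
    filters: Sequence[Sequence[float]], prune_ratio: float
) -> List[List[float]]:
    """Drop the lowest-norm filters and return the remaining ones."""
    keep = select_filters_to_keep(filters, prune_ratio)
    return [list(filters[i]) for i in keep]


# Toy layer: four filters, each flattened to two weights.
toy_filters = [[0.1, -0.1], [2.0, 1.0], [0.0, 0.05], [-1.5, 0.5]]
kept_indices = select_filters_to_keep(toy_filters, prune_ratio=0.5)
pruned_filters = prune_layer(toy_filters, prune_ratio=0.5)
```

With the toy layer above, half of the filters are removed and the two with the largest L1 norms (indices 1 and 3) remain. In a real network, removing an output filter also removes the matching input channel of the next layer, and the pruned model is usually fine-tuned afterwards to recover accuracy.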

Findings

01. E-PANNs requires 36% less computation than the original PANNs.

02. E-PANNs uses 70% less memory than the original PANNs.

03. E-PANNs slightly outperforms the original PANNs in sound recognition.

Abstract

Sounds carry an abundance of information about activities and events in our everyday environment, such as traffic noise, road works, music, or people talking. Recent machine learning methods, such as convolutional neural networks (CNNs), have been shown to be able to automatically recognize sound activities, a task known as audio tagging. One such method, pre-trained audio neural networks (PANNs), provides a neural network which has been pre-trained on over 500 sound classes from the publicly available AudioSet dataset, and can be used as a baseline or starting point for other tasks. However, the existing PANNs model has a high computational complexity and large storage requirement. This could limit the potential for deploying PANNs on resource-constrained devices, such as on-the-edge sound sensors, and could lead to high energy consumption if many such devices were deployed. In this…

Code & Models

Repositories

arshdeep-singh-boparai/e-panns (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies

Methods: Pruning