A Passive Similarity based CNN Filter Pruning for Efficient Acoustic Scene Classification
Arshdeep Singh, Mark D. Plumbley

TL;DR
This paper introduces a passive filter pruning method based on filter similarity to reduce CNN complexity for acoustic scene classification, achieving significant efficiency gains with minimal accuracy loss.
Contribution
It proposes a cosine distance-based greedy filter pruning framework with a fine-tuning strategy tailored for resource-efficient acoustic scene classification CNNs.
Findings
Reduces computations per inference by 27%.
Reduces the number of parameters by 25%.
Incurs less than a 1% drop in accuracy.
Abstract
We present a method to develop low-complexity convolutional neural networks (CNNs) for acoustic scene classification (ASC). The large size and high computational complexity of typical CNNs are a bottleneck for their deployment on resource-constrained devices. We propose a passive filter pruning framework, where a few convolutional filters from the CNNs are eliminated to yield compressed CNNs. Our hypothesis is that similar filters produce similar responses and give redundant information, allowing such filters to be eliminated from the network. To identify similar filters, a cosine distance-based greedy algorithm is proposed. A fine-tuning process is then performed to regain much of the performance lost due to filter elimination. To perform efficient fine-tuning, we analyze how the performance varies as the number of fine-tuning training examples changes. An experimental evaluation of the…
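The idea of identifying similar filters with a cosine distance-based greedy procedure can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the authors' exact algorithm: the function names, the NumPy weight layout (filters along the first axis), the rule of dropping the higher-indexed filter of each closest pair, and the `n_prune` budget are all assumptions made for illustration. It operates on filter weights only and does not include the fine-tuning step.

```python
import numpy as np


def cosine_distance_matrix(filters):
    """Pairwise cosine distances between filters.

    `filters` is assumed to have shape (n_filters, ...); each filter is
    flattened to a vector before comparison.
    """
    flat = filters.reshape(filters.shape[0], -1).astype(np.float64)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)  # guard against all-zero filters
    return 1.0 - unit @ unit.T


def greedy_similar_filter_pruning(filters, n_prune):
    """Return sorted indices of filters selected for removal.

    Hypothetical greedy rule: repeatedly find the closest pair among the
    remaining filters and drop the higher-indexed one of the pair.
    """
    dist = cosine_distance_matrix(filters)
    np.fill_diagonal(dist, np.inf)  # a filter is never paired with itself
    alive = np.ones(filters.shape[0], dtype=bool)
    pruned = []
    for _ in range(n_prune):
        # Only consider pairs in which both filters are still present.
        masked = np.where(alive[:, None] & alive[None, :], dist, np.inf)
        i, j = np.unravel_index(np.argmin(masked), masked.shape)
        if not np.isfinite(masked[i, j]):
            break  # fewer than two filters remain
        drop = max(int(i), int(j))
        alive[drop] = False
        pruned.append(drop)
    return sorted(pruned)


if __name__ == "__main__":
    # Toy layer: 4 filters with 3 input channels and a 1x1 kernel.
    # Filter 1 is a scaled copy of filter 0, so the pair is redundant.
    w = np.array(
        [[1, 0, 0], [2, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float
    ).reshape(4, 3, 1, 1)
    print(greedy_similar_filter_pruning(w, n_prune=1))
```

Cosine distance ignores filter magnitude, so a scaled copy of a filter counts as fully redundant in this sketch. In practice the pruned layer's output channels, and the matching input channels of the following layer, would also need to be removed before fine-tuning.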
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
Methods: Pruning
