Compressing Quaternion Convolutional Neural Networks for Audio Classification
Arshdeep Singh, Vinayak Abrol, Mark D. Plumbley

TL;DR
This paper presents methods to compress Quaternion CNNs for audio classification, reducing computational cost and parameters while maintaining high performance across multiple benchmarks.
Contribution
The paper introduces pruning and knowledge distillation techniques to compress Quaternion CNNs (QCNNs), enabling their deployment on resource-limited platforms.
Findings
Pruned QCNNs reduce computational cost by 50% on AudioSet.
Pruned QCNNs decrease parameter count by 80%.
Pruned QCNNs maintain competitive performance across diverse audio benchmarks.
Abstract
Conventional Convolutional Neural Networks (CNNs) in the real domain have been widely used for audio classification. However, their convolution operations process multi-channel inputs independently, limiting the ability to capture correlations among channels. This can lead to suboptimal feature learning, particularly for complex audio patterns such as multi-channel spectrogram representations. Quaternion Convolutional Neural Networks (QCNNs) address this limitation by employing quaternion algebra to jointly capture inter-channel dependencies, enabling more compact models with fewer learnable parameters while better exploiting the multi-dimensional nature of audio signals. However, QCNNs exhibit higher computational complexity due to the overhead of quaternion operations, resulting in increased inference latency and reduced efficiency compared to conventional CNNs, posing challenges for…
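To make the abstract's point about quaternion algebra concrete, here is a minimal NumPy sketch of the Hamilton product and of how a quaternion-valued linear layer (the 1x1 analogue of a quaternion convolution) shares four component matrices across a larger real weight matrix. It is not the authors' implementation: the function names `hamilton_product` and `quaternion_linear_weight`, the toy sizes, and the choice of a linear layer instead of a full convolution are all assumptions made for illustration.

```python
import numpy as np

def hamilton_product(p, q):
    """Hamilton product of two quaternions, each given as (real, i, j, k)."""
    r, x, y, z = p
    a, b, c, d = q
    return np.array([
        r * a - x * b - y * c - z * d,  # real part
        r * b + x * a + y * d - z * c,  # i part
        r * c - x * d + y * a + z * b,  # j part
        r * d + x * c - y * b + z * a,  # k part
    ])

def quaternion_linear_weight(r, x, y, z):
    """Expand four (n_out, n_in) component matrices into the equivalent
    real matrix of shape (4 * n_out, 4 * n_in). Hypothetical helper."""
    return np.block([
        [r, -x, -y, -z],
        [x,  r, -z,  y],
        [y,  z,  r, -x],
        [z, -y,  x,  r],
    ])

rng = np.random.default_rng(0)
n_in, n_out = 2, 3  # quaternion units (assumed toy sizes): 8 and 12 real channels
r, x, y, z = (rng.standard_normal((n_out, n_in)) for _ in range(4))
W = quaternion_linear_weight(r, x, y, z)

# Learnable parameters: the quaternion layer stores only the four components,
# while an unconstrained real layer of the same width stores every entry of W.
quaternion_params = 4 * n_out * n_in
real_params = (4 * n_out) * (4 * n_in)
print(W.shape, quaternion_params, real_params)
```

In this toy layout every output component mixes all four input components through shared weights, which is one way to read "jointly capture inter-channel dependencies", and the weight sharing is where the parameter saving relative to a real-valued layer of equal width comes from. The expanded matrix also hints at the extra arithmetic the abstract attributes to quaternion operations.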
