PCAS: Pruning Channels with Attention Statistics for Deep Network Compression
Kohei Yamamoto, Kurato Maeno

TL;DR
This paper introduces PCAS, a channel-pruning method that uses attention statistics to determine channel importance automatically, simplifying the compression of deep neural networks for embedded devices.
Contribution
It proposes a novel attention-based criterion for automatic channel selection, replacing manual per-layer analysis with a single compression ratio for the entire model.
Findings
Achieves higher accuracy than conventional channel-pruning methods
Lowers computational cost relative to those methods
Demonstrates effectiveness across various models and datasets
Abstract
Compression techniques for deep neural networks are important for implementing them on small embedded devices. In particular, channel pruning is a useful technique for realizing compact networks. However, many conventional methods require the compression ratio of each layer to be set manually, and it is difficult to analyze the relationships between all layers, especially for deeper models. To address these issues, we propose a simple channel-pruning technique based on attention statistics that makes it possible to evaluate the importance of channels. We improved the method by means of a criterion for automatic channel selection, using a single compression ratio for the entire model in place of per-layer model analysis. The proposed approach achieved superior performance over conventional methods with respect to accuracy and computational cost for various models and datasets. We provide analysis…
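The idea of replacing per-layer ratios with one model-wide compression ratio can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each channel has an attention value per input sample, that the "statistic" is the mean of those values, and that channels are ranked across all layers at once. The function names `attention_statistics` and `global_prune_masks`, and the rule that each layer keeps at least one channel, are hypothetical choices made for this illustration.

```python
import numpy as np


def attention_statistics(attn):
    """Per-channel importance from attention values.

    attn: array of shape (num_samples, num_channels).
    Assumption: the statistic is the mean over samples.
    """
    return np.asarray(attn, dtype=float).mean(axis=0)


def global_prune_masks(channel_scores, compression_ratio):
    """Choose channels to keep using one ratio for the whole model.

    channel_scores: list of 1-D arrays, one per layer.
    compression_ratio: fraction of all channels to remove, in [0, 1).
    Returns a list of boolean masks (True = keep), one per layer.
    """
    if not 0.0 <= compression_ratio < 1.0:
        raise ValueError("compression_ratio must be in [0, 1)")

    # Rank every channel in the model together, not layer by layer.
    all_scores = np.concatenate(channel_scores)
    n_prune = int(compression_ratio * all_scores.size)
    order = np.argsort(all_scores, kind="stable")

    keep = np.ones(all_scores.size, dtype=bool)
    keep[order[:n_prune]] = False  # drop the lowest-scoring channels

    # Split the flat mask back into per-layer masks.
    sizes = [len(s) for s in channel_scores]
    masks = np.split(keep, np.cumsum(sizes)[:-1])

    # Hypothetical safeguard: never remove an entire layer.
    for mask, scores in zip(masks, channel_scores):
        if not mask.any():
            mask[np.argmax(scores)] = True
    return masks


# Usage: two layers with 3 and 4 channels, pruning half of all channels.
scores = [np.array([0.9, 0.1, 0.5]), np.array([0.2, 0.8, 0.05, 0.7])]
masks = global_prune_masks(scores, compression_ratio=0.5)
```

Because the ranking is global, the number of channels removed from each layer follows from the scores rather than from a hand-set ratio per layer. In this example, one channel is removed from the first layer and two from the second.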
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
