Efficient Hardware Realization of Convolutional Neural Networks using Intra-Kernel Regular Pruning
Maurice Yang, Mahmoud Faraj, Assem Hussein, Vincent Gaudet

TL;DR
This paper introduces an intra-kernel regular (IKR) pruning method for CNNs that cuts parameters by up to 10x and computation by up to 7x with less than 1% accuracy loss, easing efficient hardware deployment.
Contribution
The paper presents an intra-kernel regular pruning scheme that removes redundant weights at a fine-grained level while keeping kernel structures regular, so that a hardware accelerator can exploit the resulting sparsity.
Findings
Up to 10x parameter reduction
Up to 7x computational reduction
Less than 1% accuracy degradation
Abstract
The recent trend toward increasingly deep convolutional neural networks (CNNs) has raised the demand for computational power and memory storage. Consequently, deploying CNNs in hardware has become more challenging. In this paper, we propose an Intra-Kernel Regular (IKR) pruning scheme that reduces the size and computational complexity of CNNs by removing redundant weights at a fine-grained level. Unlike other pruning methods such as Fine-Grained pruning, IKR pruning maintains regular kernel structures that a hardware accelerator can exploit. Experimental results demonstrate up to 10x parameter reduction and 7x computational reduction at a cost of less than 1% degradation in accuracy relative to the unpruned network.
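The idea of pruning inside a kernel while keeping its structure regular can be illustrated with a small sketch. The abstract does not specify how patterns are defined or chosen, so everything below is an assumption made for illustration: the function name `prune_kernel`, the three candidate patterns, and the rule of keeping the pattern with the largest retained weight magnitude are all hypothetical and are not taken from the paper.

```python
# Hypothetical sketch of pattern-based pruning inside a single kernel.
# Assumption: each kernel must match one pattern from a small, fixed set,
# so every pruned kernel has the same number of nonzero weights.

def prune_kernel(kernel, patterns):
    """Return (pattern_index, pruned_kernel).

    kernel   : square list of lists of floats
    patterns : list of sets of (row, col) positions to keep
    The pattern retaining the largest sum of absolute weights is chosen
    (an assumed selection rule, not the paper's).
    """
    def retained(pattern):
        return sum(abs(kernel[r][c]) for r, c in pattern)

    best = max(range(len(patterns)), key=lambda i: retained(patterns[i]))
    keep = patterns[best]
    pruned = [
        [w if (r, c) in keep else 0.0 for c, w in enumerate(row)]
        for r, row in enumerate(kernel)
    ]
    return best, pruned


# Three illustrative patterns for a 3x3 kernel, each keeping 3 of 9 weights.
patterns = [
    {(0, 0), (1, 1), (2, 2)},  # main diagonal
    {(0, 2), (1, 1), (2, 0)},  # anti-diagonal
    {(1, 0), (1, 1), (1, 2)},  # middle row
]

kernel = [
    [0.9, -0.1, 0.2],
    [0.05, 0.8, -0.3],
    [0.1, 0.0, 0.7],
]

idx, pruned = prune_kernel(kernel, patterns)
print(idx)     # index of the chosen pattern
print(pruned)  # kernel with weights outside the pattern set to zero
```

Under these assumptions, a pruned kernel can be stored as a pattern index plus a fixed number of weights, which is the kind of regularity that unstructured fine-grained pruning does not provide. Here each kernel shrinks from 9 to 3 weights; that ratio is a property of the toy patterns and says nothing about the figures reported in the paper.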
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Human Pose and Action Recognition
Methods: Pruning
