TL;DR
This paper presents a method that makes structured pruning of convolutional neural networks deployable in practice: it resolves the dimensional discrepancies that pruning introduces in some architectures and yields energy and inference-time gains on embedded hardware.
Contribution
We introduce a novel approach that enables the use of any structured pruning mask without dimensional discrepancies, enhancing practical deployment of pruned CNNs.
Findings
Reduced energy consumption on embedded devices
Faster inference times for pruned networks
Compatibility with various pruning masks
Abstract
Structured pruning is a popular method to reduce the cost of convolutional neural networks, which are the state of the art in many computer vision tasks. However, depending on the architecture, pruning introduces dimensional discrepancies that prevent pruned networks from actually being reduced. To tackle this problem, we propose a method that takes any structured pruning mask and generates a network that avoids these problems and can be leveraged efficiently. We provide an accurate description of our solution and report the gains in energy consumption and inference time of pruned convolutional neural networks on embedded hardware.
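To make "dimensional discrepancies" concrete, here is a minimal sketch in plain Python. It assumes the problem arises at an element-wise addition such as a residual connection; the abstract does not name the architecture, so this setting is an assumption. All function names are hypothetical, and the index-based addition shown is one generic workaround, not necessarily the method proposed in the paper.

```python
from typing import List, Tuple


def kept_channels(mask: List[int]) -> List[int]:
    """Return the original indices of the channels a channel-level mask keeps."""
    return [i for i, keep in enumerate(mask) if keep]


def naive_pruned_widths(skip_mask: List[int], branch_mask: List[int]) -> Tuple[int, int]:
    """Channel counts of each branch after physically removing masked channels.

    If the two counts differ, a plain element-wise addition is no longer defined.
    """
    return len(kept_channels(skip_mask)), len(kept_channels(branch_mask))


def indexed_add(
    skip: List[float],
    skip_mask: List[int],
    branch: List[float],
    branch_mask: List[int],
) -> Tuple[List[float], List[int]]:
    """Add two pruned tensors (one value per kept channel) by original channel index.

    Hypothetical workaround: each value is routed to its original channel
    position, so the result covers the union of channels kept by either branch.
    Returns the summed values and the original channel indices they belong to.
    """
    out = {}
    for idx, value in zip(kept_channels(skip_mask), skip):
        out[idx] = out.get(idx, 0.0) + value
    for idx, value in zip(kept_channels(branch_mask), branch):
        out[idx] = out.get(idx, 0.0) + value
    indices = sorted(out)
    return [out[i] for i in indices], indices


# Two branches of a 4-channel addition, pruned with different masks.
skip_mask = [1, 1, 0, 1]    # keeps channels 0, 1, 3
branch_mask = [1, 0, 1, 0]  # keeps channels 0, 2

widths = naive_pruned_widths(skip_mask, branch_mask)  # 3 vs 2 channels: mismatch
values, indices = indexed_add([1.0, 2.0, 4.0], skip_mask, [10.0, 30.0], branch_mask)
```

In this toy case the two branches end up with 3 and 2 channels, so they cannot be summed directly. Adding by original channel index keeps the pruned tensors small while still producing a well-defined result.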
Taxonomy
Methods: Pruning
