Optimizing the Whole-life Cost in End-to-end CNN Acceleration
Jiaqi Zhang, Xiangru Chen, Sandip Ray

TL;DR
This paper introduces GCONV Chain, a method that converts entire CNN computations into standard convolutions, improving performance and energy efficiency across various CNNs while reducing overall costs.
Contribution
The paper proposes GCONV Chain, a novel approach to unify CNN layer processing into standard convolutions, enhancing efficiency and generality for end-to-end CNN acceleration.
Findings
GCONV Chain improves CNN acceleration performance by 3.4x on average.
Energy efficiency is increased by 3.2x with GCONV Chain.
The approach reduces developer effort and total ownership costs.
Abstract
The acceleration of CNNs has gained increasing attention since their success in computer vision. With the heterogeneous functional layers that cannot be processed by the accelerators proposed for convolution layers only, modern end-to-end CNN acceleration solutions either transform the diverse computation into matrix/vector arithmetic, which loses data reuse opportunities in convolution, or introduce a dedicated functional unit for each kind of layer, which results in underutilization and high update expense. To enhance the whole-life cost efficiency, we need an acceleration solution that is efficient in processing CNN layers and has the generality to apply to all kinds of existing and emerging layers. To this end, we propose GCONV Chain, a method to convert the entire CNN computation into a chain of standard general convolutions (GCONV) that can be efficiently processed by the…
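To make the idea of expressing non-convolution layers as convolutions concrete, the following is a minimal sketch in plain Python. It is not the paper's GCONV formulation, which is not specified in this summary; it only illustrates the well-known fact that average pooling and a fully connected layer can each be written as an ordinary convolution. The function names `conv2d`, `avg_pool_as_conv`, and `fc_as_conv` are hypothetical and chosen for illustration only.

```python
def conv2d(x, w, stride=1):
    """Naive 'valid' convolution (cross-correlation form, no padding).

    x: nested list [C_in][H][W]
    w: nested list [C_out][C_in][kH][kW]
    Returns a nested list [C_out][H_out][W_out].
    """
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    kh, kw = len(w[0][0]), len(w[0][0][0])
    h_out = (h - kh) // stride + 1
    w_out = (wd - kw) // stride + 1
    return [[[sum(w[o][c][a][b] * x[c][i * stride + a][j * stride + b]
                  for c in range(c_in) for a in range(kh) for b in range(kw))
              for j in range(w_out)]
             for i in range(h_out)]
            for o in range(len(w))]


def avg_pool_as_conv(x, k):
    """k x k average pooling written as a stride-k convolution.

    Output channel o only reads input channel o, with every tap set to 1/k^2.
    """
    c = len(x)
    w = [[[[1.0 / (k * k) if o == ci else 0.0 for _ in range(k)]
           for _ in range(k)]
          for ci in range(c)]
         for o in range(c)]
    return conv2d(x, w, stride=k)


def fc_as_conv(x, weight):
    """Fully connected layer written as a convolution whose kernel spans the input.

    weight: nested list [n_out][C*H*W], flattened in (C, H, W) order.
    Returns a flat list of n_out activations (no bias term).
    """
    c, h, wd = len(x), len(x[0]), len(x[0][0])
    w = [[[[row[(ci * h + a) * wd + b] for b in range(wd)]
           for a in range(h)]
          for ci in range(c)]
         for row in weight]
    return [out[0][0] for out in conv2d(x, w)]


# Example input: 2 channels of 4 x 4, filled with 0..31.
x = [[[float(c * 16 + i * 4 + j) for j in range(4)] for i in range(4)]
     for c in range(2)]
pooled = avg_pool_as_conv(x, 2)                      # shape [2][2][2]
dense = fc_as_conv(x, [[1.0] * 32, [0.0] * 31 + [2.0]])  # two output neurons
```

Both helpers reuse the same `conv2d` routine and differ only in how the kernel and stride are chosen, which is the kind of uniformity the abstract argues for. How the paper handles other layer types is not covered by this sketch.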
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Adversarial Robustness in Machine Learning
Methods: Convolution
