TL;DR
ACCO is an automated optimizer that enhances the efficiency of spatio-temporal CNNs on edge accelerators by jointly exploring scheduling and hardware strategies, significantly improving energy and delay metrics.
Contribution
ACCO introduces a hardware-aware, automated approach for optimizing causal CNN scheduling and transformations specifically for edge hardware accelerators.
Findings
Achieves 8.4x better Energy-Delay-Product over fixed causal structures.
Improves layer-fusion optimality by 20% compared to existing tools.
Provides 19x faster and 37x more energy-efficient scheduling than spatial DF schemes.
Abstract
Spatio-Temporal Convolutional Neural Networks (ST-CNN) allow extending CNN capabilities from image processing to consecutive temporal-pattern recognition. Generally, state-of-the-art (SotA) ST-CNNs inflate the feature maps and weights from well-known CNN backbones to represent the additional time dimension. However, edge computing applications would suffer tremendously from such large computation or memory overhead. Fortunately, the overlapping nature of ST-CNN enables various optimizations, such as the dilated causal convolution structure and Depth-First (DF) layer fusion to reuse the computation between time steps and CNN sliding windows, respectively. Yet, no hardware-aware approach has been proposed that jointly explores the optimal strategy from a scheduling as well as a hardware point of view. To this end, we present ACCO, an automated optimizer that explores efficient Causal CNN…
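To make the dilated causal convolution mentioned in the abstract concrete, here is a minimal one-dimensional sketch in pure Python. It is not ACCO's implementation. The names `dilated_causal_conv1d` and `StreamingCausalConv1d` are hypothetical, and the zero-padding of samples before t = 0 is an assumption made for illustration. The streaming variant keeps a short history buffer so that each new time step costs one output computation instead of re-running the whole window, which is the kind of reuse between time steps the abstract refers to.

```python
from collections import deque


def dilated_causal_conv1d(x, weights, dilation=1):
    """Batch form: y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zero (an assumption of this sketch),
    so y[t] never depends on inputs later than t.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


class StreamingCausalConv1d:
    """Streaming form: consumes one sample per call and reuses stored history."""

    def __init__(self, weights, dilation=1):
        self.weights = list(weights)
        self.dilation = dilation
        # Number of past samples (including the current one) the kernel can reach.
        span = (len(self.weights) - 1) * dilation + 1
        self.history = deque([0.0] * span, maxlen=span)

    def step(self, sample):
        # The newest sample sits at history[-1]; the oldest one is dropped.
        self.history.append(sample)
        # weights[k] multiplies the input from k * dilation steps ago.
        return sum(
            w * self.history[-1 - k * self.dilation]
            for k, w in enumerate(self.weights)
        )


# Usage: both forms produce the same outputs for the same input sequence.
signal = [1, 2, 3, 4, 5]
kernel = [1, 0.5]
batch_out = dilated_causal_conv1d(signal, kernel, dilation=2)
layer = StreamingCausalConv1d(kernel, dilation=2)
stream_out = [layer.step(s) for s in signal]
```

With a dilation of 2 and a two-tap kernel, each output depends only on the current sample and the sample two steps earlier, so the streaming layer needs a history of just three samples regardless of how long the input sequence is.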
Taxonomy
Methods: Dilated Causal Convolution · Causal Convolution · Convolution
