BP-Im2col: Implicit Im2col Supporting AI Backpropagation on Systolic Arrays
Jianchao Yang, Mei Wen, Junzhong Shen, Yasong Cao, Minjin Tang, Renyu Yang, Jiawei Fei, Chunyuan Zhang

TL;DR
This paper introduces BP-im2col, a novel algorithm that efficiently supports AI backpropagation on systolic array accelerators, significantly reducing runtime, memory bandwidth, and storage overhead compared to traditional methods.
Contribution
The paper proposes BP-im2col, the first im2col algorithm optimized for backpropagation, enabling more efficient training on systolic array-based accelerators.
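For readers unfamiliar with the term, the following is a minimal sketch of traditional (explicit) im2col for a single-channel 2-D input, the baseline that the paper builds on. It is not the BP-im2col algorithm itself. The function name `im2col`, the toy input, and the kernel are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def im2col(x, k, stride=1):
    """Unfold every k x k window of a 2-D input into one row of a matrix.

    Illustrative helper (hypothetical name); single channel, no padding.
    """
    H, W = x.shape
    out_h = (H - k) // stride + 1
    out_w = (W - k) // stride + 1
    cols = np.empty((out_h * out_w, k * k), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * stride:i * stride + k, j * stride:j * stride + k]
            cols[i * out_w + j] = patch.ravel()
    return cols, (out_h, out_w)

# Toy data: a 4x4 input and a 2x2 kernel.
x = np.arange(16, dtype=np.float64).reshape(4, 4)
w = np.array([[1.0, 0.0],
              [0.0, -1.0]])

# After unfolding, the convolution becomes one matrix-vector product,
# which is the form a systolic array (a GEMM engine) consumes.
cols, out_shape = im2col(x, k=2)
y = (cols @ w.ravel()).reshape(out_shape)
```

Here `cols` has one row per output pixel, so overlapping windows are stored more than once. That duplication is the storage cost of explicit im2col, and implicit variants aim to avoid it by generating addresses on the fly instead of materializing the matrix.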
Findings
Reduces backpropagation runtime by 34.9% on average.
Cuts off-chip memory bandwidth by at least 22.7%.
Decreases storage overhead in backpropagation by at least 74.78%.
Abstract
State-of-the-art systolic array-based accelerators adopt the traditional im2col algorithm to accelerate the inference of convolutional layers. However, traditional im2col cannot efficiently support AI backpropagation. Backpropagation in convolutional layers involves performing transposed convolution and dilated convolution, which usually introduce many zero-spaces into the feature map or kernel. The zero-space data reorganization interferes with the continuity of training and incurs additional, non-negligible overhead in terms of off- and on-chip storage, access, and performance. Since countermeasures for backpropagation are rarely proposed, we propose BP-im2col, a novel im2col algorithm for AI backpropagation, and implement it in RTL on a TPU-like accelerator. Experiments on a TPU-like accelerator indicate that BP-im2col reduces the backpropagation runtime by 34.9% on average, and…
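To make the "zero-space" issue concrete, here is a small NumPy sketch, assuming a single-channel layer with a 7x7 input, a 3x3 kernel, stride 2, and no padding. These sizes and all helper names are illustrative assumptions, not the paper's configuration. It computes both backward-pass gradients the conventional way, by explicitly zero-inserting the output gradient, and checks them against a direct reference.

```python
import numpy as np

def correlate2d_valid(x, k):
    """Stride-1 'valid' cross-correlation of two 2-D arrays."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def zero_insert(a, stride):
    """Insert (stride - 1) zeros between neighbouring elements."""
    h, w = a.shape
    out = np.zeros(((h - 1) * stride + 1, (w - 1) * stride + 1))
    out[::stride, ::stride] = a
    return out

rng = np.random.default_rng(0)
H, K, S = 7, 3, 2                      # assumed toy sizes
x = rng.standard_normal((H, H))        # forward input
w = rng.standard_normal((K, K))        # kernel
n_out = (H - K) // S + 1
dy = rng.standard_normal((n_out, n_out))  # gradient w.r.t. the output

# Reference gradients straight from the definition of strided convolution.
dx_ref = np.zeros_like(x)
dw_ref = np.zeros_like(w)
for i in range(n_out):
    for j in range(n_out):
        dx_ref[i * S:i * S + K, j * S:j * S + K] += dy[i, j] * w
        dw_ref += dy[i, j] * x[i * S:i * S + K, j * S:j * S + K]

# Input gradient (transposed convolution): zero-insert dy, pad by K - 1,
# then correlate with the 180-degree-rotated kernel.
dy_dilated = zero_insert(dy, S)
dy_padded = np.pad(dy_dilated, K - 1)
dx = correlate2d_valid(dy_padded, w[::-1, ::-1])

# Weight gradient (dilated convolution): correlate the input with the
# zero-inserted dy acting as a dilated kernel.
dw = correlate2d_valid(x, dy_dilated)

# Share of explicitly stored zeros in each reorganized operand.
zero_fraction_dx = 1.0 - dy.size / dy_padded.size
zero_fraction_dw = 1.0 - dy.size / dy_dilated.size
```

In this toy case the reorganized operands `dy_padded` and `dy_dilated` are mostly zeros that must still be stored and moved if the reorganization is done explicitly. This is the kind of overhead the abstract attributes to running backpropagation through traditional im2col.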
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Error Correcting Code Techniques · Advanced Memory and Neural Computing
