The Case for Asymmetric Systolic Array Floorplanning
C. Peltekis, D. Filippas, G. Dimitrakopoulos, C. Nicopoulos

TL;DR
This paper argues that asymmetric physical layouts of systolic arrays, tailored to data bus asymmetries, can significantly reduce interconnect power and improve energy efficiency in deep learning hardware accelerators.
Contribution
The paper introduces the concept of physically asymmetric SAs and demonstrates that they consume less power than symmetric layouts for CNN workloads.
Findings
Asymmetric SAs reduce interconnect power by 9.1% on CNN workloads.
Overall power consumption falls by 2.1%.
Asymmetric layouts better match data bus widths and activity profiles.
Abstract
The widespread proliferation of deep learning applications has triggered the need to accelerate them directly in hardware. General Matrix Multiplication (GEMM) kernels are elemental deep-learning constructs and they inherently map onto Systolic Arrays (SAs). SAs are regular structures that are well-suited for accelerating matrix multiplications. Typical SAs use a pipelined array of Processing Elements (PEs), which communicate with local connections and pre-orchestrated data movements. In this work, we show that the physical layout of SAs should be asymmetric to minimize wirelength and improve energy efficiency. The floorplan of the SA adjusts better to the asymmetric widths of the horizontal and vertical data buses and their switching activity profiles. It is demonstrated that such physically asymmetric SAs reduce interconnect power by 9.1% when executing state-of-the-art Convolutional…
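The intuition behind matching the floorplan to bus widths and switching activity can be illustrated with a first-order model. The sketch below is not the authors' method. It assumes that each PE has a fixed area, that horizontal wires span the PE width and vertical wires span the PE height, and that interconnect power scales with bits × activity × wire length. The function names and every numeric value (bus widths, activity factors, PE area) are hypothetical and chosen only for illustration.

```python
import math


def switched_wire_cost(width, height, bits_h, act_h, bits_v, act_v):
    """Relative switched wire length per PE (assumed first-order power proxy).

    Horizontal bus wires cross the PE width; vertical bus wires cross its height.
    """
    return bits_h * act_h * width + bits_v * act_v * height


def best_aspect_ratio(bits_h, act_h, bits_v, act_v):
    """Width/height ratio that minimizes switched_wire_cost at fixed PE area.

    Follows from minimizing c_h * w + c_v * h subject to w * h = area,
    which gives w / h = c_v / c_h.
    """
    return (bits_v * act_v) / (bits_h * act_h)


def pe_dimensions(area, ratio):
    """Return (width, height) of a PE with the given area and width/height ratio."""
    height = math.sqrt(area / ratio)
    return ratio * height, height


# Hypothetical example values, not taken from the paper.
area = 1.0                 # normalized PE area
bits_h, act_h = 16, 0.5    # horizontal bus: width in bits, activity factor
bits_v, act_v = 32, 0.5    # vertical bus: width in bits, activity factor

ratio = best_aspect_ratio(bits_h, act_h, bits_v, act_v)
w, h = pe_dimensions(area, ratio)

square_cost = switched_wire_cost(1.0, 1.0, bits_h, act_h, bits_v, act_v)
asym_cost = switched_wire_cost(w, h, bits_h, act_h, bits_v, act_v)
saving = 1.0 - asym_cost / square_cost

print(f"aspect ratio (w/h): {ratio:.2f}")
print(f"relative wire-cost saving vs. square PE: {saving:.1%}")
```

Under these assumptions, the bus that carries more switched bits gets the shorter wires, so the PE is stretched along the other axis. A real floorplan also depends on cell placement, routing congestion, and clock distribution, none of which this model captures.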
Taxonomy
Topics: Advanced Memory and Neural Computing · Low-power high-performance VLSI design · Parallel Computing and Optimization Techniques
