Fast and Fusiest: An Optimal Fusion-Aware Mapper for Accelerator Design
Tanner Andrulis, Michael Gilbert, Vivienne Sze, Joel S. Emer

TL;DR
The paper introduces FFM, a fast mapper that optimally fuses computations in tensor algebra accelerators, significantly reducing energy-delay-product and runtime compared to prior methods.
Contribution
A novel fusion-aware mapper that efficiently searches the fused mapspace, pruning suboptimal mappings to achieve optimal results rapidly.
Findings
Up to 1.8× reduction in energy-delay product (EDP) compared to the state of the art.
Over 10,000× faster mapping than prior automated mappers.
Reduces EDP by over 2× within the same runtime.
Abstract
A low-latency and energy-efficient tensor algebra accelerator design must optimize how data movement and operations are scheduled (i.e., mapped) in the accelerator architecture. A key mapping optimization is fusion, meaning holding data on-chip between computation steps in the workload, which has been shown to reduce energy and latency by reducing expensive off-chip data movement. The optimal fusion choice depends on the workload and workload shape, so a mapper, which searches for the optimal mapping, can improve energy and latency significantly. However, prior mappers cannot find optimal mappings with fusion (i.e., fused mappings) in a feasible runtime because the number of fused mappings to search increases exponentially with the number of computation steps in the workload. In this paper, we introduce the Fast and Fusiest Mapper (FFM), a mapper to quickly find optimal…
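To make the exponential growth concrete, here is a toy sketch, not FFM itself. It assumes a linear chain of computation steps in which each intermediate tensor is either kept on-chip (fused) or written off-chip, and it ignores every other mapping choice a real mapper considers. The function names are hypothetical and chosen only for illustration.

```python
from itertools import product


def fusion_choices(num_steps):
    """Enumerate fuse/spill decisions for a linear chain of steps.

    Toy assumption: a chain of `num_steps` steps has `num_steps - 1`
    intermediate tensors, and each one is either kept on-chip (True)
    or written off-chip (False).
    """
    return list(product([False, True], repeat=num_steps - 1))


def count_fusion_choices(num_steps):
    """Closed-form count of the decisions enumerated above."""
    return 2 ** (num_steps - 1)


def energy_delay_product(energy, delay):
    """EDP is the product of energy and delay (lower is better)."""
    return energy * delay


if __name__ == "__main__":
    # The number of fusion choices doubles with each added step.
    for steps in (2, 4, 8, 16):
        print(steps, "steps:", count_fusion_choices(steps), "fusion choices")
```

Even under this simplified model the count doubles with every added step, which is why exhaustively evaluating the EDP of every fused mapping becomes infeasible for long workloads.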
