The Turbo-Charged Mapper: Fast and Optimal Mapping for Energy-efficient and Low-latency Accelerator Design
Michael Gilbert, Tanner Andrulis, Vivienne Sze, Joel S. Emer

TL;DR
The Turbo-Charged Mapper (TCM) is a fast mapper for DNN accelerators that finds optimal mappings. It prunes redundant and suboptimal mappings from the search space, which cuts search time, and the optimal mappings it finds improve energy-delay product.
Contribution
We introduce dataplacement, a new mapping concept that lets TCM explore the full mapping search space and find optimal mappings efficiently.
Findings
Reduces search space by up to 32 orders of magnitude.
Improves energy-delay product (EDP) by 1.2× to 6.5×.
Reduces mapping search time from 5 hours to 17 seconds.
Abstract
The energy and latency of an accelerator running a deep neural network (DNN) depend on how the computation and data movement are scheduled in the accelerator (i.e., mapping), and picking an optimal mapping is essential to achieve high-performance accelerators. However, it is challenging to find mappings that maximize accelerator performance. The space of mappings is large, and prior works cannot guarantee finding optimal mappings because they use heuristics or metaheuristics to narrow the search space. To address this challenge, we propose the Turbo-Charged Mapper (TCM), a fast mapper that finds optimal mappings. The key to our approach is that we define a new mapping concept called dataplacement, which, like the prior concept of dataflow, allows for clear analysis and comparison of mappings. Through it, we identify opportunities to prune redundant and suboptimal mappings, reducing…
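The general idea of pruning suboptimal mappings without losing the optimum can be illustrated with a toy example. The sketch below is not TCM's algorithm and does not implement dataplacement. It assumes a hypothetical matrix-multiply workload, a hypothetical two-level memory with made-up energy and bandwidth constants, and a mapping defined only by an output tile size `(tm, tn)`. In this toy cost model, a larger tile never increases DRAM traffic, so any mapping dominated by a larger valid tile can be discarded safely.

```python
from itertools import product


def divisors(n):
    """All positive divisors of n (candidate tile sizes)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def edp(M, K, N, tm, tn, e_dram=100.0, e_mac=1.0, bandwidth=4.0):
    """Energy-delay product under a hypothetical cost model.

    Assumptions (illustrative only): an output tile of tm x tn stays
    on chip, A is re-read once per column tile, B once per row tile,
    and the accelerator performs one MAC per cycle.
    """
    a_reads = M * K * (N // tn)
    b_reads = K * N * (M // tm)
    c_writes = M * N
    dram = a_reads + b_reads + c_writes
    macs = M * K * N
    energy = dram * e_dram + macs * e_mac
    delay = max(macs, dram / bandwidth)  # compute-bound or memory-bound
    return energy * delay


def valid_mappings(M, N, capacity):
    """Every (tm, tn) whose output tile fits in the on-chip buffer."""
    return [(tm, tn)
            for tm, tn in product(divisors(M), divisors(N))
            if tm * tn <= capacity]


def prune_dominated(mappings):
    """Drop mappings for which another valid mapping has tiles at least
    as large in both dimensions. In this toy model such mappings can
    never have a lower EDP, so the optimum is preserved."""
    return [(tm, tn) for tm, tn in mappings
            if not any(om >= tm and on >= tn and (om, on) != (tm, tn)
                       for om, on in mappings)]


def best(M, K, N, mappings):
    """Exhaustively evaluate the given mappings and return the best one."""
    return min(mappings, key=lambda m: edp(M, K, N, *m))


if __name__ == "__main__":
    M = K = N = 64
    full = valid_mappings(M, N, capacity=256)
    pruned = prune_dominated(full)
    print(len(full), "mappings before pruning,", len(pruned), "after")
    print("same optimum:",
          edp(M, K, N, *best(M, K, N, full)) == edp(M, K, N, *best(M, K, N, pruned)))
```

The pruning rule here is safe only because of the monotonic cost model assumed above; a real mapper would need its own argument for why each pruned mapping cannot be optimal.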
