A 16 nm 1.60TOPS/W High Utilization DNN Accelerator with 3D Spatial Data Reuse and Efficient Shared Memory Access
Xiaoling Yi, Ryan Antonio, Yunhao Deng, Fanchen Kong, Joren Dumoulin, Jun Yin, Marian Verhelst

TL;DR
This paper introduces the Voltra chip, a 16nm DNN accelerator that uses 3D spatial data reuse and flexible shared memory access to raise compute utilization across a range of AI workloads, reaching 1.60 TOPS/W energy efficiency.
Contribution
The paper presents a novel 3D spatial dataflow and shared memory architecture that enhances utilization and efficiency in DNN accelerators, demonstrated by the Voltra chip.
Findings
Up to 2.0x spatial utilization improvement over 2D designs.
Achieves 1.60 TOPS/W energy efficiency in 16nm technology.
Improves temporal utilization by 2.12-2.94x through hardware data pre-fetching and dynamic memory allocation in the shared memory.
Abstract
Achieving high compute utilization across a wide range of AI workloads is crucial for the efficiency of versatile DNN accelerators. This paper presents the Voltra chip and its utilization-optimised DNN accelerator architecture, which leverages 3-Dimensional (3D) spatial data reuse along with efficient and flexible shared memory access. The 3D spatial dataflow enables balanced spatial data reuse across three dimensions, improving spatial utilization by up to 2.0x compared to a conventional 2D design. Inside the shared memory access architecture, Voltra incorporates flexible data streamers that enable mixed-grained hardware data pre-fetching and dynamic memory allocation, further improving the temporal utilization by 2.12-2.94x and achieving 1.15-2.36x total latency speedup compared with the non-prefetching and separated memory architecture, respectively. Fabricated in 16nm technology,…
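To make the spatial utilization claim concrete, the sketch below counts how many processing elements do useful work when a matrix multiplication is tiled onto a 2D array versus a 3D array with the same number of elements. The array shapes (16x32 and 8x8x8), the workload size, and the function name `spatial_utilization` are illustrative assumptions, not Voltra's actual dimensions or mapping. The utilization model is also a simplification: useful work divided by the slots occupied once each dimension is rounded up to whole tiles.

```python
import math


def spatial_utilization(problem, array):
    """Fraction of PE slots doing useful work.

    `problem` holds the sizes of the loop dimensions that are unrolled
    spatially; `array` holds the matching PE array dimensions.
    Each dimension is padded up to a whole number of tiles.
    """
    useful = math.prod(problem)
    tiles = math.prod(math.ceil(p / a) for p, a in zip(problem, array))
    return useful / (tiles * math.prod(array))


# Hypothetical workload: a GEMM with a small M dimension.
M, N, K = 8, 64, 64

# Assumed 2D array: only M and N are unrolled spatially, K runs over time.
util_2d = spatial_utilization((M, N), (16, 32))

# Assumed 3D array with the same 512 PEs: M, N and K are all unrolled.
util_3d = spatial_utilization((M, N, K), (8, 8, 8))

print(f"2D 16x32 utilization: {util_2d:.2f}")
print(f"3D 8x8x8 utilization: {util_3d:.2f}")
```

With these assumed shapes, the 2D array leaves half of its rows idle because M is smaller than the array height, while the 3D array spreads the same work across a third dimension and avoids the padding. Other workload shapes will give different ratios.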
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Advanced Neural Network Applications
