Data Streaming and Traffic Gathering in Mesh-based NoC for Deep Neural Network Acceleration
Binayak Tiwari, Mei Yang, Xiaohang Wang, Yingtao Jiang

TL;DR
This paper introduces a modified mesh-based NoC architecture with streaming buses and gather packets to efficiently support the complex data traffic patterns of DNN accelerators, improving latency and power consumption.
Contribution
The paper proposes a modified mesh architecture that adds one-way/two-way streaming buses for one-to-many (multicast) traffic and gather packets for many-to-one (gather) traffic in DNN accelerators.
Findings
The two-way streaming architecture reduces latency more than the one-way architecture.
Gather packets significantly decrease runtime latency and power consumption.
Modified mesh supports complex traffic patterns more efficiently.
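The intuition behind the gather-packet finding can be illustrated with a rough hop-count sketch. This is a simplified model and not the paper's latency analysis: it assumes a single row of PEs sending results to a sink at the end of the row, counts only packet-hops (one packet crossing one link), and ignores payload size, contention, and router pipeline delay. The function names are hypothetical.

```python
def unicast_packet_hops(n_pes: int) -> int:
    """Packet-hops when every PE in a row sends its own packet to the sink.

    Assumption: the PE at distance d from the sink needs d hops, d = 1..n_pes.
    """
    return sum(range(1, n_pes + 1))


def gather_packet_hops(n_pes: int) -> int:
    """Packet-hops when one gather packet sweeps the row toward the sink.

    Assumption: the packet starts at the farthest PE and picks up each
    PE's result as it passes, so every link is crossed exactly once.
    """
    return n_pes


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, unicast_packet_hops(n), gather_packet_hops(n))
```

Under these assumptions the unicast cost grows quadratically with the row length (36 packet-hops for 8 PEs) while the gather cost grows linearly (8 packet-hops), which is consistent in direction with the reported latency and power reductions.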
Abstract
The increasing popularity of deep neural network (DNN) applications demands high computing power and efficient hardware accelerator architectures. DNN accelerators use a large number of processing elements (PEs) and on-chip memory for storing weights and other parameters. As the communication backbone of a DNN accelerator, the network-on-chip (NoC) plays an important role in supporting various dataflow patterns and enabling processing with communication parallelism. However, the widely used mesh-based NoC architectures cannot efficiently support the one-to-many and many-to-one traffic that is prevalent in DNN workloads. In this paper, we propose a modified mesh architecture with a one-way/two-way streaming bus to speed up one-to-many (multicast) traffic, and the use of gather packets to support many-to-one (gather) traffic. The analysis of the runtime latency of a…
