Improving the Performance of a NoC-based CNN Accelerator with Gather Support

Binayak Tiwari, Mei Yang, Xiaohang Wang, Yingtao Jiang, Venkatesan Muthukumar

arXiv:2108.02567 · cs.LG · August 6, 2021

TL;DR

This paper proposes using gather packets in mesh-based NoCs with output stationary systolic arrays to efficiently handle many-to-one data traffic in CNN accelerators, improving latency and power consumption.

Contribution

It introduces a gather packet mechanism for NoC-based CNN accelerators to optimize many-to-one traffic handling, enhancing performance over traditional unicast methods.

Findings

1. Reduced latency in CNN data transfer
2. Lower power consumption in NoC communication
3. Improved efficiency with gather packets on AlexNet and VGG-16

Abstract

The increasing application of deep learning technology drives the need for an efficient parallel computing architecture for Convolutional Neural Networks (CNNs). A significant challenge in designing a many-core CNN accelerator is handling the data movement between the processing elements. The CNN workload introduces many-to-one traffic in addition to one-to-one and one-to-many traffic. As the de facto standard for on-chip communication, Network-on-Chip (NoC) can support various unicast and multicast traffic. For many-to-one traffic, repetitive unicast is employed, which is inefficient. In this paper, we propose to use the gather packet on mesh-based NoCs employing an output-stationary systolic array in support of many-to-one traffic. The gather packet collects the data from the intermediate nodes on its way to the destination. This method is…
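As a rough illustration of why repetitive unicast is costly for many-to-one traffic, the sketch below counts packet hops in a single mesh row. It is not the paper's model. The layout (a row of processing elements with the destination one hop past the last one), the function names, and the hop-count metric are all assumptions made for this example, and it ignores contention, packet size, and router pipeline details.

```python
def unicast_packet_hops(num_pes: int) -> int:
    """Packet hops when every PE in a row sends its own packet to the sink.

    Assumed layout: PEs sit in a line and the sink is one hop past the
    nearest PE, so the k-th nearest PE is k hops away (k = 1..num_pes).
    """
    return sum(range(1, num_pes + 1))


def gather_packet_hops(num_pes: int) -> int:
    """Packet hops when one gather packet starts at the farthest PE.

    The single packet visits each intermediate PE on its way to the sink
    and picks up that PE's data, so it crosses num_pes links in total.
    """
    return num_pes


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, unicast_packet_hops(n), gather_packet_hops(n))
```

Under these assumptions, a row of 8 processing elements needs 36 packet hops with repetitive unicast and 8 with a single gather packet. The payloads still travel the same distances in both cases; what the sketch counts is how many separate packets, each with its own header, cross each link.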

Taxonomy

Methods: Convolution