On-Chip Communication Network for Efficient Training of Deep Convolutional Networks on Heterogeneous Manycore Systems

Wonje Choi; Karthi Duraisamy; Ryan Gary Kim; Janardhan Rao Doppa; Partha Pratim Pande; Diana Marculescu; Radu Marculescu

arXiv:1712.02293·cs.DC·December 7, 2017

TL;DR

This paper introduces a hybrid on-chip network architecture combining wireline and wireless links to enhance communication efficiency in CPU-GPU systems for CNN training, resulting in faster, more energy-efficient deep learning model training.

Contribution

It proposes a novel hybrid NoC design tailored for CNN training workloads on heterogeneous manycore systems, significantly reducing latency and energy consumption.

Findings

01. 1.8x reduction in network latency

02. 2.2x increase in network throughput

03. 25% savings in energy-delay-product

Abstract

Convolutional Neural Networks (CNNs) have shown a great deal of success in diverse application domains including computer vision, speech recognition, and natural language processing. However, as the size of datasets and the depth of neural network architectures continue to grow, it is imperative to design high-performance and energy-efficient computing hardware for training CNNs. In this paper, we consider the problem of designing specialized CPU-GPU based heterogeneous manycore systems for energy-efficient training of CNNs. It has already been shown that the typical on-chip communication infrastructures employed in conventional CPU-GPU based heterogeneous manycore platforms are unable to handle both CPU and GPU communication requirements efficiently. To address this issue, we first analyze the on-chip traffic patterns that arise from the computational processes associated with training…
