Scope: A Scalable Merged Pipeline Framework for Multi-Chip-Module NN Accelerators
Zongle Huang, Hongyang Jia, Kaiwei Zou, Yongpan Liu

TL;DR
Scope is a merged pipeline framework for multi-chip-module (MCM) NN accelerators that optimizes multiple layers jointly, improving throughput (up to 1.73x on ResNet-152) and scalability while reducing the complexity of design space exploration.
Contribution
The paper proposes a novel multi-layer merged pipeline framework and efficient search algorithms to enhance MCM NN accelerator performance and scalability.
Findings
Up to 1.73x higher throughput on ResNet-152
Reduces design space exploration complexity from exponential to linear
Keeps energy consumption comparable to state-of-the-art methods
Abstract
Neural network (NN) accelerators with multi-chip-module (MCM) architectures enable the integration of massive computation capability; however, they suffer from underutilized computing resources and off-chip communication overheads. Traditional parallelization schemes for NN inference on MCM architectures, such as intra-layer parallelism and inter-layer pipelining, fail to overcome both challenges at once, which limits the scalability of MCM architectures. We observe that existing works typically deploy layers separately rather than considering them jointly. Leaving this dimension underexploited forces a compromise between system computation and communication, hindering optimal utilization, especially as hardware and software scale. To address this limitation, we propose Scope, a merged pipeline framework incorporating this overlooked multi-layer dimension, thereby achieving…
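The tension the abstract describes between the two traditional schemes can be illustrated with a toy cost model. The sketch below is not Scope's algorithm and uses no data from the paper. The layer costs, chip count, and per-layer communication cost are hypothetical numbers chosen only to show how inter-layer pipelining can leave chips idle, while intra-layer parallelism pays a communication cost on every layer.

```python
# Toy cost model (hypothetical, not from the paper): compares the two
# traditional parallelization schemes named in the abstract.

def pipeline_interval(layer_costs, stages):
    """Inter-layer pipelining: each stage (a list of layer indices) runs on
    one chip. The steady-state interval between outputs is set by the
    slowest stage."""
    return max(sum(layer_costs[i] for i in stage) for stage in stages)


def pipeline_utilization(layer_costs, stages):
    """Fraction of total chip time spent computing in steady state."""
    stage_times = [sum(layer_costs[i] for i in stage) for stage in stages]
    return sum(stage_times) / (len(stages) * max(stage_times))


def intra_layer_time(layer_costs, n_chips, comm_per_layer):
    """Intra-layer parallelism: every layer is split across all chips, and
    each layer pays an assumed fixed off-chip communication cost."""
    return sum(cost / n_chips + comm_per_layer for cost in layer_costs)


# Hypothetical workload: four layers with unbalanced compute costs.
layer_costs = [8.0, 2.0, 2.0, 4.0]
one_layer_per_chip = [[0], [1], [2], [3]]

interval = pipeline_interval(layer_costs, one_layer_per_chip)
utilization = pipeline_utilization(layer_costs, one_layer_per_chip)
sequential = intra_layer_time(layer_costs, n_chips=4, comm_per_layer=1.0)

print(f"pipelining: interval={interval}, utilization={utilization:.0%}")
print(f"intra-layer: time per input={sequential}")
```

With these made-up numbers, the pipeline is limited by its heaviest layer and keeps the chips only half busy, while the intra-layer scheme spends half of its time on communication. Neither scheme avoids both problems, which is the gap the abstract says a joint multi-layer treatment targets.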
Taxonomy
Topics: Advanced Neural Network Applications · Embedded Systems Design Techniques · Big Data and Digital Economy
