TL;DR
This paper introduces Camuy, a lightweight model for systolic arrays that enables rapid exploration of configurations to optimize deep neural network inference performance on hardware accelerators.
Contribution
The paper presents Camuy, a novel, easy-to-integrate model that helps design and optimize systolic array configurations for diverse neural network architectures.
Findings
Camuy accurately estimates cycles, data movement, and utilization for DNN models.
Design choices in neural networks significantly affect systolic array efficiency.
Camuy facilitates rapid configuration exploration for neural network accelerators.
Abstract
Systolic arrays are a promising computing concept that is particularly in line with CMOS technology trends and with the linear algebra operations found in the processing of artificial neural networks. The recent success of such deep learning methods in a wide set of applications has led to a variety of models which, although conceptually similar in that they are based on convolutions and fully-connected layers, show in detail a huge diversity in operations due to a large design space: an operand's dimension varies substantially since it depends on design principles such as receptive field size, number of features, striding, dilation, and grouping of features. Lastly, recent networks extend previously plain feedforward models with various forms of connectivity, such as in ResNet or DenseNet. The problem of choosing an optimal systolic array configuration cannot be solved analytically, thus instead methods and tools are…
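To make the abstract's point about operand dimensions concrete, the following is a minimal sketch of how receptive field size, feature counts, striding, dilation, and grouping determine the matrix shapes that a convolution presents to a systolic array. It is not Camuy's model or API: the function names, the im2col-style lowering to a matrix multiplication, and the weight-stationary tiling used for the utilization estimate are all assumptions chosen for illustration.

```python
from math import ceil


def conv_output_size(in_size, kernel, stride=1, dilation=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * padding - effective_kernel) // stride + 1


def conv_gemm_dims(in_h, in_w, in_ch, out_ch, kernel,
                   stride=1, dilation=1, padding=0, groups=1):
    """Per-group GEMM shape (M, K, N) under an assumed im2col lowering.

    The layer is treated as `groups` independent products of an
    (M x K) activation matrix with a (K x N) weight matrix.
    """
    out_h = conv_output_size(in_h, kernel, stride, dilation, padding)
    out_w = conv_output_size(in_w, kernel, stride, dilation, padding)
    m = out_h * out_w                        # one row per output position
    k = kernel * kernel * (in_ch // groups)  # receptive field x input features
    n = out_ch // groups                     # output features per group
    return m, k, n


def weight_tile_utilization(k, n, array_h, array_w):
    """Fraction of processing elements holding a useful weight.

    Assumes (hypothetically) that the K x N weight matrix is cut into
    array_h x array_w tiles that are loaded one after another.
    """
    tiles = ceil(k / array_h) * ceil(n / array_w)
    return (k * n) / (tiles * array_h * array_w)


# A 3x3 convolution on a 56x56x64 input with 64 output features.
dense = conv_gemm_dims(56, 56, 64, 64, kernel=3, padding=1)
# The same layer split into 32 groups.
grouped = conv_gemm_dims(56, 56, 64, 64, kernel=3, padding=1, groups=32)

dense_util = weight_tile_utilization(dense[1], dense[2], 128, 128)
grouped_util = weight_tile_utilization(grouped[1], grouped[2], 128, 128)
```

Under these assumptions, the dense layer fills about 45% of a 128x128 array, while the grouped variant fills well under 1%, which illustrates why a single array configuration is unlikely to suit every network design.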
Taxonomy
Methods: Concatenated Skip Connection · Softmax · Bottleneck Residual Block · Batch Normalization · Average Pooling · Dropout · 1x1 Convolution · Dense Connections · Max Pooling
