Incomplete Dot Products for Dynamic Computation Scaling in Neural Network Inference
Bradley McDanel, Surat Teerapittayanon, H.T. Kung

TL;DR
This paper introduces incomplete dot products (IDP), a method allowing neural networks to dynamically adjust computation during inference by selectively using a subset of channels, balancing accuracy and efficiency.
Contribution
The paper proposes IDP with a channel contribution profile, enabling a single network to scale computation dynamically without multiple models, and extends it to multiple profiles for different scaling ranges.
Findings
IDP reduces computation by up to 75% on MNIST and CIFAR-10.
With VGG-16 on CIFAR-10, 50% IDP (using only the first half of the channels in each layer) achieves 70% accuracy.
A single IDP-trained network stays usable when only a beginning subset of channels is computed, so the accuracy/computation trade-off can be adjusted at inference time.
Abstract
We propose the use of incomplete dot products (IDP) to dynamically adjust the number of input channels used in each layer of a convolutional neural network during feedforward inference. IDP adds monotonically non-increasing coefficients, referred to as a "profile", to the channels during training. The profile orders the contribution of each channel in non-increasing order. At inference time, the number of channels used can be dynamically adjusted to trade off accuracy for lowered power consumption and reduced latency by selecting only a beginning subset of channels. This approach allows for a single network to dynamically scale over a computation range, as opposed to training and deploying multiple networks to support different levels of computation scaling. Additionally, we extend the notion to multiple profiles, each optimized for some specific range of computation scaling. We present…
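The mechanism described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `linear_profile` and `incomplete_dot`, the `fraction` parameter, and the specific linearly decaying coefficients are assumptions chosen for illustration. The only properties taken from the abstract are that the profile is monotonically non-increasing and that inference uses a beginning subset of channels.

```python
import math


def linear_profile(n):
    """Hypothetical profile: non-increasing coefficients 1, (n-1)/n, ..., 1/n."""
    return [(n - i) / n for i in range(n)]


def incomplete_dot(x, w, profile, fraction=1.0):
    """Profile-weighted dot product over only the first `fraction` of channels.

    x, w, profile: equal-length sequences (inputs, weights, coefficients).
    fraction: share of channels to use, in (0, 1]; 1.0 is the complete dot product.
    """
    if not (len(x) == len(w) == len(profile)):
        raise ValueError("x, w and profile must have the same length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Keep at least one channel; always take channels from the beginning.
    k = max(1, math.ceil(fraction * len(x)))
    return sum(profile[i] * w[i] * x[i] for i in range(k))


x = [1.0, 1.0, 1.0, 1.0]
w = [1.0, 1.0, 1.0, 1.0]
profile = linear_profile(4)                       # [1.0, 0.75, 0.5, 0.25]

full = incomplete_dot(x, w, profile)              # all four channels
half = incomplete_dot(x, w, profile, fraction=0.5)  # first two channels only
```

Because the coefficients are ordered so that earlier channels carry more weight, dropping the trailing channels removes the smallest contributions first, which is the intuition behind trading accuracy for computation with a single set of weights.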
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
