Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks
Ahmed T. Elthakeb, Prannoy Pilligundla, Alex Cloninger, Hadi Esmaeilzadeh

TL;DR
This paper introduces DCQ, a divide-and-conquer approach that leverages intermediate feature representations for quantized training of neural networks, significantly reducing memory and computation while maintaining accuracy.
Contribution
The paper proposes a novel sectional knowledge distillation method that trains quantized network sections independently and stitches them together, improving quantization performance.
Findings
DCQ improves binary quantization accuracy by 21.6%.
DCQ enhances ternary quantization accuracy by 9.3%.
Incorporating DCQ boosts existing quantization methods' performance.
Abstract
The deep layers of modern neural networks extract a rather rich set of features as an input propagates through the network. This paper sets out to harvest these rich intermediate representations for quantization with minimal accuracy loss while significantly reducing the memory footprint and compute intensity of the DNN. This paper utilizes knowledge distillation through the teacher-student paradigm (Hinton et al., 2015) in a novel setting that exploits the feature extraction capability of DNNs for higher-accuracy quantization. As such, our algorithm logically divides a pretrained full-precision DNN into multiple sections, each of which exposes intermediate features to train a team of students independently in the quantized domain. This divide and conquer strategy, in fact, makes the training of each student section possible in isolation while all these independently trained sections are…
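The sectional training idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy dense-plus-ReLU sections, the `ternarize` threshold rule, the mean-squared-error loss on intermediate features, and the straight-through gradient update are all assumptions chosen to keep the example short. It shows a full-precision teacher split into sections, each quantized student section trained in isolation against the teacher's intermediate features, and the trained sections then chained back together.

```python
import numpy as np

rng = np.random.default_rng(0)

def ternarize(w):
    # Assumed rule (not from the paper): zero small weights, one shared scale for the rest.
    delta = 0.7 * np.mean(np.abs(w))
    mask = np.abs(w) > delta
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(w) * mask

def section_forward(w, x):
    # In this toy setup a "section" is a single dense layer followed by ReLU.
    return np.maximum(x @ w, 0.0)

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def train_section(w_teacher, x_in, y_target, steps=200, lr=0.05):
    """Train one quantized student section in isolation.

    x_in and y_target are the teacher's features entering and leaving the
    section. Latent full-precision weights are updated with a straight-through
    estimator (an assumption), and the best ternary weights seen are returned.
    """
    w = w_teacher.copy()
    best_wq = ternarize(w)
    best_loss = mse(section_forward(best_wq, x_in), y_target)
    for _ in range(steps):
        wq = ternarize(w)
        pre = x_in @ wq
        out = np.maximum(pre, 0.0)
        loss = mse(out, y_target)
        if loss < best_loss:
            best_wq, best_loss = wq, loss
        # Gradient of the MSE through ReLU; passed to w as if ternarize were identity.
        grad_pre = 2.0 * (out - y_target) / out.size * (pre > 0)
        w -= lr * (x_in.T @ grad_pre)
    return best_wq, best_loss

# Hypothetical full-precision teacher, logically divided into three sections.
teacher = [0.5 * rng.normal(size=(8, 8)) for _ in range(3)]
x = rng.normal(size=(32, 8))

# Teacher intermediate features: feats[i] enters section i, feats[i + 1] leaves it.
feats = [x]
for w in teacher:
    feats.append(section_forward(w, feats[-1]))

# Each student section is trained independently of the others.
before, after, students = [], [], []
for i, w in enumerate(teacher):
    before.append(mse(section_forward(ternarize(w), feats[i]), feats[i + 1]))
    wq, loss = train_section(w, feats[i], feats[i + 1])
    students.append(wq)
    after.append(loss)

# Chain the independently trained sections into one quantized network.
stitched = x
for wq in students:
    stitched = section_forward(wq, stitched)

print("per-section MSE, direct ternarization:", before)
print("per-section MSE, after sectional training:", after)
print("end-to-end MSE of stitched student:", mse(stitched, feats[-1]))
```

Because each call to `train_section` only needs the teacher's features on either side of its own section, the three calls could run in any order or in parallel, which is the property the divide and conquer framing relies on.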
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Knowledge Distillation · Convolution · Dense Connections · LeNet
