SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training
Xiaohan Chen, Yang Zhao, Yue Wang, Pengfei Xu, Haoran You, Chaojian Li, Yonggan Fu, Yingyan Lin, Zhangyang Wang

TL;DR
SmartDeal introduces a weight decomposition method that enhances energy efficiency in deep neural network inference and training by reducing data movement and storage requirements through structured sparsity and quantization.
Contribution
The paper proposes a novel weight decomposition framework with structural constraints that significantly improves energy efficiency for DNN inference and training, including a dedicated hardware accelerator.
Findings
Up to 2.44x higher energy efficiency in inference
10.56x reduction in storage energy during training
4.48x reduction in training energy with negligible accuracy loss
Abstract
The record-breaking performance of deep neural networks (DNNs) comes with heavy parameterization, leading to the use of external dynamic random-access memory (DRAM) for storage. The prohibitive energy of DRAM accesses makes it non-trivial to deploy DNNs on resource-constrained devices, calling for minimizing the weight and data movements to improve the energy efficiency. We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation, in order to aggressively boost the storage and energy efficiency, for both inference and training. The core of SD is a novel weight decomposition with structural constraints, carefully crafted to unleash the hardware efficiency potential. Specifically, we decompose each weight tensor as the product of a small basis matrix and a large structurally sparse coefficient matrix whose non-zeros are quantized to…
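The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the alternating least-squares loop, the row-wise pruning rule, the function names, and all parameters are hypothetical. The abstract is truncated before it names the quantization target, so the signed power-of-two grid used here is purely an assumption of this sketch.

```python
import numpy as np


def quantize_pow2(x, min_exp=-6, max_exp=0):
    """Round non-zero entries to a signed power of two (ASSUMED grid, see lead-in).

    Exponents are clipped to [min_exp, max_exp], so very small non-zeros
    are pushed up to 2**min_exp rather than zeroed.
    """
    out = np.zeros_like(x, dtype=float)
    nz = x != 0
    exp = np.clip(np.round(np.log2(np.abs(x[nz]))), min_exp, max_exp)
    out[nz] = np.sign(x[nz]) * 2.0 ** exp
    return out


def sparsify_rows(coef, keep):
    """Structured sparsity: zero whole rows, keeping the `keep` (>= 1) rows
    with the largest L2 norm. The row-wise rule is an illustrative choice."""
    norms = np.linalg.norm(coef, axis=1)
    mask = np.zeros(coef.shape[0], dtype=bool)
    mask[np.argsort(norms)[-keep:]] = True
    return coef * mask[:, None]


def decompose(w, basis_rank, keep_rows, iters=10, seed=0):
    """Approximate w (m x n) as coef (m x r) @ basis (r x n).

    Alternates between a least-squares fit of each factor and projecting
    the coefficient matrix onto the sparse, quantized constraint set.
    """
    m, n = w.shape
    rng = np.random.default_rng(seed)
    basis = rng.standard_normal((basis_rank, n))  # small dense basis
    coef = np.zeros((m, basis_rank))
    for _ in range(iters):
        # Fit coefficients with the basis fixed: solve basis.T @ coef.T ~= w.T.
        coef = np.linalg.lstsq(basis.T, w.T, rcond=None)[0].T
        # Project onto the constraints: row-sparse, then quantized non-zeros.
        coef = quantize_pow2(sparsify_rows(coef, keep_rows))
        # Refit the basis with the constrained coefficients fixed.
        basis = np.linalg.lstsq(coef, w, rcond=None)[0]
    return coef, basis
```

Calling `decompose(w, basis_rank=4, keep_rows=48)` on a 64 x 4 weight matrix returns a 64 x 4 coefficient matrix with at most 48 non-zero rows and a 4 x 4 basis. Under the assumed grid, only the small basis needs full-precision storage, while each retained coefficient reduces to a sign and a small integer exponent.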
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
