A stochastic three-block splitting algorithm and its application to quantized deep neural networks
Fengmiao Bian, Ren Liu, Xiaoqun Zhang

TL;DR
This paper introduces a stochastic three-block splitting algorithm for training quantized deep neural networks, providing convergence guarantees and demonstrating effectiveness on various architectures and datasets.
Contribution
It proposes a novel stochastic three-block alternating minimization algorithm with proven convergence for non-convex quantized DNN training, and applies it successfully to binary weight networks.
Findings
The STAM algorithm converges to an $$-stationary point with optimal rate.
Experimental results show STAM outperforms classical methods in training quantized DNNs.
STAM achieves higher accuracy on VGG and ResNet models trained on CIFAR datasets.
Abstract
Deep neural networks (DNNs) have made great progress in various fields. In particular, the quantized neural network is a promising technique making DNNs compatible on resource-limited devices for memory and computation saving. In this paper, we mainly consider a non-convex minimization model with three blocks to train quantized DNNs and propose a new stochastic three-block alternating minimization (STAM) algorithm to solve it. We develop a convergence theory for the STAM algorithm and obtain an -stationary point with optimal convergence rate . Furthermore, we apply our STAM algorithm to train DNNs with relaxed binary weights. The experiments are carried out on three different network structures, namely VGG-11, VGG-16 and ResNet-18. These DNNs are trained using two different data sets, CIFAR-10 and CIFAR-100, respectively. We compare our STAM…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsStochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Machine Learning and ELM
