Stochastic Markov Gradient Descent and Training Low-Bit Neural Networks

Jonathan Ashbrock; Alexander M. Powell

arXiv:2008.11117·cs.LG·December 23, 2020

TL;DR

This paper introduces Stochastic Markov Gradient Descent (SMGD), a discrete optimization algorithm for training low-bit (quantized) neural networks when memory is highly constrained during training. The method is supported by theoretical performance guarantees and numerical results.

Contribution

The paper presents SMGD, a discrete optimization method designed for training quantized neural networks in settings where memory is highly constrained during training.

Findings

01. Theoretical guarantees for SMGD's performance.

02. Numerical results demonstrating the effectiveness of SMGD.

03. Applicable to highly memory-constrained training scenarios.

Abstract

The massive size of modern neural networks has motivated substantial recent interest in neural network quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
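The abstract does not state SMGD's update rule, so the code below should not be read as the authors' algorithm. It is only a minimal sketch of the general idea of a discrete, memory-light update, under several assumptions: weights are stored as small integer indices on a uniform grid with spacing `delta`, and each index moves by at most one grid step per iteration, with a probability chosen so that the expected move matches an ordinary gradient step. The function name `lattice_step` and the parameters `lr`, `delta`, and `k_max` are hypothetical and introduced only for illustration.

```python
import random

def lattice_step(indices, grads, lr, delta, k_max, rng):
    """One hypothetical update that keeps every weight on the grid delta * Z.

    A weight is represented by an integer index k, with value k * delta.
    Each index moves one grid step against its gradient with probability
    min(1, lr * |g| / delta) and otherwise stays put. When that probability
    is below 1 and no clipping occurs, the expected change in the weight
    equals the usual gradient step -lr * g.
    """
    updated = []
    for k, g in zip(indices, grads):
        p = min(1.0, lr * abs(g) / delta)
        if g != 0.0 and rng.random() < p:
            k += -1 if g > 0 else 1
        # Clip so the index always fits in a fixed low-bit range.
        updated.append(max(-k_max, min(k_max, k)))
    return updated

# Toy usage: minimise sum((w_i - t_i)^2) with targets that lie on the grid.
rng = random.Random(0)
delta = 0.25
targets = [0.5, -0.75, 1.0]
indices = [0, 0, 0]
for _ in range(2000):
    grads = [2.0 * (k * delta - t) for k, t in zip(indices, targets)]
    indices = lattice_step(indices, grads, lr=0.05, delta=delta, k_max=7, rng=rng)
weights = [k * delta for k in indices]
print(indices, weights)
```

In this sketch no full-precision copy of the weights is kept between iterations; only the integer indices persist, which is the kind of saving a memory-constrained training setting is concerned with. Whether SMGD uses this exact rule is not something the abstract confirms.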
