Combinatorial optimization for low bit-width neural networks
Han Zhou, Aida Ashrafi, Matthew B. Blaschko

TL;DR
This paper introduces a combinatorial optimization approach for training low-bit-width neural networks, focusing on binary weights, which offers a potentially hardware-efficient alternative to gradient-based methods.
Contribution
It develops a novel combinatorial optimization method for binary neural networks, reducing reliance on high-performance hardware during training.
Findings
Achieves competitive accuracy on binary classification tasks.
Offers $\mathcal{O}(nd)$ time complexity for linear models.
Demonstrates effectiveness of greedy coordinate descent combined with the new approach.
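
To make the idea of optimizing binary weights directly more concrete, here is a minimal Python sketch of greedy coordinate descent over sign weights for a linear classifier. It is an illustration only, not the authors' approximation algorithm: the hinge loss, the all-ones initialization, and the names `hinge` and `greedy_binary_fit` are assumptions made for this example. At each step it evaluates every single-coordinate sign flip and applies the one that lowers the training loss the most; evaluating all $d$ candidate flips on $n$ samples costs $\mathcal{O}(nd)$ per step in this sketch.

```python
import numpy as np


def hinge(margins):
    # Mean hinge loss taken over the sample axis (axis 0).
    return np.maximum(0.0, 1.0 - margins).mean(axis=0)


def greedy_binary_fit(X, y, w_init=None, max_steps=1000):
    """Greedy sign-flip search for w in {-1, +1}^d (illustrative sketch).

    X: (n, d) feature matrix, y: (n,) labels in {-1, +1}.
    Returns the binary weight vector and its training hinge loss.
    """
    n, d = X.shape
    # Assumption: start from all +1 weights unless a start point is given.
    w = np.ones(d) if w_init is None else np.asarray(w_init, dtype=float).copy()
    margins = y * (X @ w)                      # shape (n,)
    loss = hinge(margins)
    for _ in range(max_steps):
        # Flipping w_j changes margin i by -2 * y_i * x_ij * w_j,
        # so all d candidate margin vectors fit in one (n, d) array.
        candidates = margins[:, None] - 2.0 * (y[:, None] * X) * w[None, :]
        cand_loss = hinge(candidates)          # shape (d,)
        j = int(np.argmin(cand_loss))
        if cand_loss[j] >= loss:
            break                              # no single flip improves the loss
        margins = candidates[:, j]
        w[j] = -w[j]
        loss = cand_loss[j]
    return w, float(loss)


if __name__ == "__main__":
    # Synthetic data with a hypothetical ground-truth sign vector.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 16))
    w_true = rng.choice([-1.0, 1.0], size=16)
    y = np.sign(X @ w_true)
    w, loss = greedy_binary_fit(X, y)
    print("training hinge loss:", loss)
    print("sign agreement with w_true:", float(np.mean(w == w_true)))
```

No full-precision copy of the weights is kept here: the only state is the sign vector and the cached margins, which is the kind of memory saving the summary alludes to. The search stops at a local optimum with respect to single flips, so it carries no approximation guarantee of its own.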
Abstract
Low-bit-width neural networks have been extensively explored for deployment on edge devices to reduce computational resources. Existing approaches have focused on gradient-based optimization in a two-stage train-and-compress setting or as a combined optimization where gradients are quantized during training. Such schemes require high-performance hardware during the training phase and usually store an equivalent number of full-precision weights apart from the quantized weights. In this paper, we explore methods of direct combinatorial optimization in the problem of risk minimization with binary weights, which can be made equivalent to a non-monotone submodular maximization under certain conditions. We employ an approximation algorithm for the cases with single and multilayer neural networks. For linear models, it has $\mathcal{O}(nd)$ time complexity where $n$ is the sample size and …
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Machine Learning and Algorithms
