Combinatorial optimization for low bit-width neural networks

Han Zhou; Aida Ashrafi; Matthew B. Blaschko

arXiv:2206.02006·cs.LG·June 7, 2022

TL;DR

This paper introduces a combinatorial optimization approach for training low-bit-width neural networks, focusing on binary weights, which offers a potentially hardware-efficient alternative to gradient-based methods.

Contribution

It develops a novel combinatorial optimization method for binary neural networks, reducing reliance on high-performance hardware during training.
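As a rough sketch of the problem being attacked, risk minimization with binary weights can be written as below. The symbols $\ell$ (loss), $f$ (model), and $(x_i, y_i)$ (training pairs), as well as the choice of $\{-1,+1\}$ as the binary set, are assumptions made for illustration and are not taken from the paper.

$$
\min_{w \in \{-1,+1\}^d} \; \frac{1}{n} \sum_{i=1}^{n} \ell\big(f(x_i; w),\, y_i\big)
$$

The search space is discrete and has $2^d$ points, which is why the problem is treated combinatorially rather than with gradients.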

Findings

01

Achieves competitive accuracy on binary classification tasks.

02

Offers $O(nd)$ time complexity for linear models.

03

Demonstrates effectiveness of greedy coordinate descent combined with the new approach.

Abstract

Low bit-width neural networks have been extensively explored for deployment on edge devices to reduce computational resources. Existing approaches have focused on gradient-based optimization in a two-stage train-and-compress setting or as a combined optimization where gradients are quantized during training. Such schemes require high-performance hardware during the training phase and usually store an equivalent number of full-precision weights apart from the quantized weights. In this paper, we explore methods of direct combinatorial optimization in the problem of risk minimization with binary weights, which can be made equivalent to a non-monotone submodular maximization under certain conditions. We employ an approximation algorithm for the cases with single and multilayer neural networks. For linear models, it has $O(nd)$ time complexity where $n$ is the sample size and $d$ …
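To make the idea of optimizing binary weights directly (without gradients or stored full-precision weights) more concrete, here is a minimal sketch of a cyclic sign-flip coordinate descent for a linear model with squared loss. It is not the approximation algorithm from the paper; the function name `coordinate_flip_binary`, the squared loss, and the $\{-1,+1\}$ weight set are all assumptions chosen for illustration.

```python
import random


def coordinate_flip_binary(X, y, sweeps=10, seed=0):
    """Local search for w in {-1,+1}^d minimizing sum_i (x_i . w - y_i)^2.

    Illustrative sketch only: a simple cyclic coordinate descent over sign
    flips, not the submodular approximation algorithm described in the paper.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [rng.choice((-1, 1)) for _ in range(d)]
    # Residuals r_i = x_i . w - y_i are kept up to date, so testing one
    # flip costs O(n) and a full sweep over all coordinates costs O(n d).
    r = [sum(X[i][j] * w[j] for j in range(d)) - y[i] for i in range(n)]
    for _ in range(sweeps):
        improved = False
        for j in range(d):
            # Flipping w_j shifts each residual by -2 * w_j * X[i][j].
            delta = [-2 * w[j] * X[i][j] for i in range(n)]
            # Loss change: sum_i (r_i + delta_i)^2 - r_i^2.
            change = sum(2 * r[i] * delta[i] + delta[i] ** 2 for i in range(n))
            if change < 0:
                w[j] = -w[j]
                r = [r[i] + delta[i] for i in range(n)]
                improved = True
        if not improved:
            break  # no single flip lowers the loss: a 1-flip local minimum
    loss = sum(v * v for v in r)
    return w, loss
```

Only the binary weights and the residual vector are stored, which is the kind of memory profile the abstract contrasts with gradient-based schemes. A local search like this can stop at a local minimum, so it carries no approximation guarantee of its own.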

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Machine Learning and Algorithms