Efficient and Robust Mixed-Integer Optimization Methods for Training Binarized Deep Neural Networks
Jannis Kurtz, Bubacarr Bah

TL;DR
This paper introduces mixed-integer optimization techniques for training binarized deep neural networks: a formulation that can be solved to global optimality, heuristics that improve efficiency, and a robustness model. The resulting networks are aimed at applications on resource-limited devices.
Contribution
The paper presents a mixed-integer linear programming formulation for binarized deep neural networks (BDNNs), together with a local search heuristic, an iterative data-splitting heuristic, and a robustness model, advancing training methods for resource-efficient neural networks.
Findings
BDNNs can be globally optimized using mixed-integer linear programming.
The local search and data-splitting heuristics reduce the computational effort of training.
BDNNs often outperform classical DNNs on small architectures.
Abstract
Compared to classical deep neural networks, their binarized versions can be useful for applications on resource-limited devices due to their reduced memory consumption and computational demands. In this work we study deep neural networks with binary activation functions and continuous or integer weights (BDNN). We show that the BDNN can be reformulated as a mixed-integer linear program with bounded weight space, which can be solved to global optimality by classical mixed-integer programming solvers. Additionally, a local search heuristic is presented to calculate locally optimal networks. Furthermore, to improve efficiency, we present an iterative data-splitting heuristic which iteratively splits the training set into smaller subsets using the k-means method. Afterwards, all data points in a given subset are forced to follow the same activation pattern, which leads to a much smaller…
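The two ingredients named in the abstract, a network with binary activations and a k-means split of the training set, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the zero threshold, the {0, 1} output values, the binary final layer, and the farthest-point initialization are all assumptions made for illustration, and the mixed-integer program itself is not shown.

```python
def binary_activation(z):
    # Assumption: threshold at zero with outputs in {0, 1}; the abstract
    # only states that the activation functions are binary.
    return 1 if z >= 0 else 0


def bdnn_forward(x, weights):
    """Return the activation pattern of every layer for input x.

    weights is a list of layers; each layer is a list of rows,
    one row of (continuous or integer) weights per neuron.
    """
    pattern = []
    h = list(x)
    for layer in weights:
        h = [binary_activation(sum(w * v for w, v in zip(row, h)))
             for row in layer]
        pattern.append(h)
    return pattern


def dist2(p, q):
    # Squared Euclidean distance between two points.
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_split(points, k, iters=20):
    """Assign each point to one of k subsets using plain Lloyd iterations."""
    # Deterministic farthest-point initialization (an illustrative choice).
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    W = [[[1.0, -1.0], [-1.0, 1.0]], [[1.0, 1.0]]]  # toy weights
    pts = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
    labels = kmeans_split(pts, 2)
    for p, l in zip(pts, labels):
        print(l, p, bdnn_forward(p, W))
```

In the data-splitting heuristic described above, points that share a label would be constrained to share one activation pattern. The sketch only computes the labels and the patterns; it does not enforce that constraint.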
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Sparse and Compressive Sensing Techniques
