A Mixed-Integer Programming Approach to Training Dense Neural Networks
Vrishabh Patil, Yonatan Mintz

TL;DR
This paper introduces a mixed-integer programming (MIP) method for training dense neural networks. It aims to produce more parsimonious models with competitive out-of-sample performance, motivated by the long training times and large memory footprint of standard ANNs.
Contribution
It presents a novel MIP formulation for training fully-connected ANNs with binary and ReLU activations, enabling more parsimonious models.
Findings
Achieves competitive out-of-sample performance
Produces more memory-efficient models
Supports these claims with numerical experiments comparing the MIP-based methods against existing approaches
Abstract
Artificial Neural Networks (ANNs) are prevalent machine learning models that are applied across various real-world classification tasks. However, training ANNs is time-consuming and the resulting models take a lot of memory to deploy. In order to train more parsimonious ANNs, we propose a novel mixed-integer programming (MIP) formulation for training fully-connected ANNs. Our formulations can account for both binary and rectified linear unit (ReLU) activations, and for the use of a log-likelihood loss. We present numerical experiments comparing our MIP-based methods against existing approaches and show that we are able to achieve competitive out-of-sample performance with more parsimonious models.
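To make concrete how a ReLU activation can appear inside a MIP, the sketch below checks the standard big-M encoding of y = max(0, z) with one binary variable b. It is a textbook construction from the MIP literature, shown for illustration only, and is not necessarily the formulation the paper uses. The bound `M` and the function name `relu_big_m_feasible` are assumptions introduced here. The sketch also treats the pre-activation z as a fixed number, whereas in a training formulation the weights that produce z are themselves decision variables.

```python
# Illustrative sketch (not the authors' formulation): big-M encoding of
# y = max(0, z) with a binary indicator b, assuming |z| <= M.
#
# Constraints:
#   y >= z
#   y >= 0
#   y <= z + M * (1 - b)
#   y <= M * b
#   b in {0, 1}

def relu_big_m_feasible(z, M):
    """Return {b: (lowest y, highest y)} for each binary b that is feasible.

    z is a fixed pre-activation value and M is an assumed bound with |z| <= M.
    """
    if abs(z) > M:
        raise ValueError("big-M bound violated: need |z| <= M")
    feasible = {}
    for b in (0, 1):
        lower = max(z, 0)                    # from y >= z and y >= 0
        upper = min(z + M * (1 - b), M * b)  # from the two big-M upper bounds
        if lower <= upper:
            feasible[b] = (lower, upper)
    return feasible


if __name__ == "__main__":
    for z in (2, -3, 0):
        print(z, relu_big_m_feasible(z, M=10))
```

For every z within the bound, each feasible choice of b pins y to a single value equal to max(0, z), which is why a solver can represent the activation exactly using only linear constraints and one binary variable per unit.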
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
Methods: Stochastic Gradient Descent
