A Mixed-Integer Programming Approach to Training Dense Neural Networks

Vrishabh Patil; Yonatan Mintz

arXiv:2201.00723 · cs.LG · June 27, 2022

Open Access

TL;DR

This paper introduces a mixed-integer programming (MIP) formulation for training dense neural networks. The resulting models are more parsimonious, and therefore cheaper in memory to deploy, while remaining competitive in out-of-sample performance.

Contribution

The paper presents a novel MIP formulation for training fully-connected ANNs that supports binary and ReLU activations as well as a log-likelihood loss, enabling more parsimonious models.

Findings

01. Achieves competitive out-of-sample performance

02. Produces more memory-efficient models

03. Demonstrates effectiveness through numerical experiments

Abstract

Artificial Neural Networks (ANNs) are prevalent machine learning models that are applied across various real-world classification tasks. However, training ANNs is time-consuming and the resulting models take a lot of memory to deploy. In order to train more parsimonious ANNs, we propose a novel mixed-integer programming (MIP) formulation for training fully-connected ANNs. Our formulations can account for both binary and rectified linear unit (ReLU) activations, and for the use of a log-likelihood loss. We present numerical experiments comparing our MIP-based methods against existing approaches and show that we are able to achieve competitive out-of-sample performance with more parsimonious models.
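
To make the idea of expressing a ReLU activation inside a MIP more concrete, here is a minimal sketch of the standard textbook big-M encoding of y = max(0, a) with one binary variable z. This is an assumption for illustration only and is not necessarily the formulation used in the paper; the function names, the bound M, and the candidate grid are all hypothetical. The code does not call a solver. It only checks which outputs y satisfy the linear constraints for some binary z.

```python
# Illustrative only: the standard big-M encoding of y = max(0, a).
# This is NOT claimed to be the paper's formulation.

def bigm_relu_feasible(a, y, z, M, tol=1e-9):
    """Return True if (a, y, z) satisfies the big-M ReLU constraints.

    Assuming |a| <= M and z binary, the constraints are:
        y >= 0
        y >= a
        y <= a + M * (1 - z)
        y <= M * z
    """
    return (
        z in (0, 1)
        and y >= -tol
        and y >= a - tol
        and y <= a + M * (1 - z) + tol
        and y <= M * z + tol
    )


def feasible_outputs(a, M, candidates):
    """Collect the candidate y values that are feasible for some binary z."""
    return sorted(
        {y for y in candidates for z in (0, 1) if bigm_relu_feasible(a, y, z, M)}
    )


if __name__ == "__main__":
    grid = [0.0, 0.5, 1.0, 2.0, 3.0]  # hypothetical candidate outputs
    for a in (2.0, -3.0, 0.0):
        print(a, feasible_outputs(a, M=10.0, candidates=grid))
```

For each pre-activation a on this grid, the only feasible output is max(0, a): the binary z selects whether the unit is active (y = a) or inactive (y = 0), which is what lets a MIP solver reason about the activation with linear constraints.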

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning

Methods: Stochastic Gradient Descent