Feed-Forward Neural Networks as a Mixed-Integer Program

Navid Aftabi; Nima Moradi; Fatemeh Mahroo

arXiv:2402.06697 · cs.LG · February 13, 2024 · 1 citation

Open Access

TL;DR

This paper explores modeling ReLU neural networks as mixed-integer programs and investigates their use in training and evaluating neural network architectures, including binary and binarized models, through experiments on digit classification.

Contribution

It introduces a novel MIP formulation for trained ReLU neurons and applies it to train and analyze various neural network architectures, including binary and binarized DNNs.

Findings

01

MIP formulations effectively model ReLU neurons in neural networks.

02

The approach improves training processes for certain neural network architectures.

03

Experimental results on digit classification demonstrate the method's potential.

Abstract

Deep neural networks (DNNs) are widely studied in various applications. A DNN consists of layers of neurons that compute affine combinations, apply nonlinear operations, and produce corresponding activations. The rectified linear unit (ReLU) is a typical nonlinear operator, outputting the max of its input and zero. In scenarios like max pooling, where multiple input values are involved, a fixed-parameter DNN can be modeled as a mixed-integer program (MIP). This formulation, with continuous variables representing unit outputs and binary variables for ReLU activation, finds applications across diverse domains. This study explores the formulation of trained ReLU neurons as MIP and applies MIP models for training neural networks (NNs). Specifically, it investigates interactions between MIP techniques and various NN architectures, including binary DNNs (employing step activation functions)…
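As a rough illustration of the idea in the abstract (a continuous variable for the unit output, a binary variable for the ReLU activation state), the sketch below checks the standard big-M encoding of a single ReLU neuron. It is not necessarily the formulation used in the paper. The bounds `lower` and `upper` on the pre-activation, the helper names, and the discretized grid are all assumptions made for illustration; a real model would hand these constraints to a MIP solver rather than enumerate candidates.

```python
from itertools import product


def relu_bigm_feasible(a, y, z, lower, upper, tol=1e-9):
    """Return True if (y, z) satisfies the big-M ReLU constraints.

    a      : pre-activation value (the affine combination w.x + b)
    y      : candidate continuous output of the neuron
    z      : candidate binary activation indicator (1 = active)
    lower  : assumed known lower bound on a (must be < 0)
    upper  : assumed known upper bound on a (must be > 0)
    """
    return (
        z in (0, 1)
        and y >= -tol                       # y >= 0
        and y >= a - tol                    # y >= a
        and y <= upper * z + tol            # z = 0 forces y <= 0
        and y <= a - lower * (1 - z) + tol  # z = 1 forces y <= a
    )


def feasible_outputs(a, lower, upper, grid):
    """List grid values of y that are feasible for at least one binary z."""
    return sorted(
        {y for y, z in product(grid, (0, 1))
         if relu_bigm_feasible(a, y, z, lower, upper)}
    )


# Hypothetical bounds and a coarse grid of candidate outputs in [-2, 3].
LOWER, UPPER = -2.0, 3.0
GRID = [i * 0.5 for i in range(-4, 7)]

for a in (-1.5, 0.0, 2.0):
    # The only feasible output should coincide with max(a, 0).
    print(a, feasible_outputs(a, LOWER, UPPER, GRID), max(a, 0.0))
```

For each pre-activation value, the four inequalities together with the binary `z` leave exactly one feasible output, which matches `max(a, 0)`. Tighter bounds give a tighter linear relaxation, which is why bound computation usually matters in formulations of this kind.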

Taxonomy

Topics: Neural Networks and Applications