A Solver + Gradient Descent Training Algorithm for Deep Neural Networks
Dhananjay Ashok, Vineel Nagisetty, Christopher Srinivasa, Vijay Ganesh

TL;DR
This paper introduces GDSolver, a hybrid training algorithm for deep neural networks that combines gradient descent with MILP solving to escape local minima, resulting in improved accuracy and efficiency.
Contribution
The paper presents a novel hybrid training algorithm that integrates gradient descent with MILP to enhance deep neural network training performance.
Findings
On regression tasks, GDSolver achieves 31.5% lower MSE than the GD baselines, in less training time.
On MNIST and CIFAR10, GDSolver attains higher accuracy than the GD baselines while using half the training data.
The hybrid method scales well to large datasets and models.
Abstract
We present a novel hybrid algorithm for training Deep Neural Networks that combines the state-of-the-art Gradient Descent (GD) method with a Mixed Integer Linear Programming (MILP) solver, outperforming GD and variants in terms of accuracy, as well as resource and data efficiency for both regression and classification tasks. Our GD+Solver hybrid algorithm, called GDSolver, works as follows: given a DNN as input, GDSolver invokes GD to partially train it until it gets stuck in a local minimum, at which point GDSolver invokes an MILP solver to exhaustively search a region of the loss landscape around the weight assignments of the DNN's final layer parameters, with the goal of tunnelling through and escaping the local minimum. The process is repeated until the desired accuracy is achieved. In our experiments, we find that GDSolver not only scales well to additional data and very large model…
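The alternating loop described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the one-parameter loss, the function names (`gradient_descent`, `local_exhaustive_search`, `hybrid_train`), and all hyperparameters are assumptions made for illustration, and a plain grid search over a bounded interval stands in for the MILP solver call on the final-layer weights.

```python
def loss(w):
    # Toy non-convex loss (an assumption, not from the paper):
    # a local minimum near w = 0.96 and a lower minimum near w = -1.04.
    return (w * w - 1.0) ** 2 + 0.3 * w


def grad(w):
    # Analytic derivative of the toy loss.
    return 4.0 * w * (w * w - 1.0) + 0.3


def gradient_descent(w, lr=0.01, tol=1e-8, max_steps=10_000):
    """Run plain GD until one step improves the loss by less than tol."""
    for _ in range(max_steps):
        w_next = w - lr * grad(w)
        if loss(w) - loss(w_next) < tol:
            break  # progress has stalled: treat this as "stuck"
        w = w_next
    return w


def local_exhaustive_search(w, radius=2.5, step=0.05, min_gain=1e-9):
    """Stand-in for the MILP call: test every grid point in [w - radius, w + radius].

    Returns the best grid point if it beats the current loss by min_gain,
    otherwise returns w unchanged.
    """
    n = int(round(2 * radius / step))
    candidates = [w - radius + i * step for i in range(n + 1)]
    best = min(candidates, key=loss)
    return best if loss(best) < loss(w) - min_gain else w


def hybrid_train(w, rounds=5):
    """Alternate GD and bounded exhaustive search until the search stops helping."""
    for _ in range(rounds):
        w = gradient_descent(w)
        w_escaped = local_exhaustive_search(w)
        if w_escaped == w:  # nothing better inside the search region
            break
        w = w_escaped
    return gradient_descent(w)


if __name__ == "__main__":
    w_gd = gradient_descent(2.0)
    w_hybrid = hybrid_train(2.0)
    print("GD only:", w_gd, loss(w_gd))
    print("hybrid: ", w_hybrid, loss(w_hybrid))
```

Starting from `w = 2.0`, plain GD settles in the shallower basin near `w ≈ 0.96`, while the hybrid loop jumps to the deeper basin near `w ≈ -1.04` and refines there. In the paper's setting the search step is an MILP over a region around the final-layer weights rather than a one-dimensional grid, so this sketch only conveys the control flow.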
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
