Optimization over Trained Neural Networks: Going Large with Gradient-Based Algorithms
Jiatai Tong, Yilin Zhu, Thiago Serra, Samuel Burer

TL;DR
This paper introduces a gradient-based local search algorithm for optimizing over trained neural network surrogates. It has a lower per-iteration cost than existing methods and becomes competitive with, and then dominant over, other local search methods as the optimization models grow larger.
Contribution
It proposes a gradient-based algorithm with lower per-iteration cost than existing methods, and further adapts it to exploit the piecewise-linear structure of neural networks that use Rectified Linear Units (ReLUs).
Findings
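The new algorithm has a lower per-iteration cost than existing local search methods.
A variant of the algorithm exploits the piecewise-linear structure of ReLU networks.
Both methods become competitive with, and then dominant over, other local search methods as the optimization models become larger.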
Abstract
When optimizing a nonlinear objective, one can employ a neural network as a surrogate for the nonlinear function. However, the resulting optimization model can be time-consuming to solve globally with exact methods. As a result, local search that exploits the neural-network structure has been employed to find good solutions within a reasonable time limit. For such methods, a lower per-iteration cost is advantageous when solving larger models. The contribution of this paper is two-fold. First, we propose a gradient-based algorithm with lower per-iteration cost than existing methods. Second, we further adapt this algorithm to exploit the piecewise-linear structure of neural networks that use Rectified Linear Units (ReLUs). In line with prior research, our methods become competitive with -- and then dominant over -- other local search methods as the optimization models become larger.
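To make the setting concrete, here is a minimal sketch of gradient-based local search over the inputs of a trained ReLU network. It is not the algorithm proposed in the paper, whose details the abstract does not give; it is plain projected gradient ascent over a box of inputs, assuming a scalar-output, fully connected network. The function names (`forward_and_grad`, `projected_gradient_ascent`), the box bounds, and the fixed step size are all illustrative assumptions. The backward pass uses the activation pattern recorded in the forward pass, which is where the piecewise-linear structure shows up: while the pattern is unchanged, the network is affine and its gradient is constant.

```python
import numpy as np

def forward_and_grad(weights, biases, x):
    """Evaluate a scalar-output ReLU network and its gradient w.r.t. the input x.

    weights[i] has shape (n_out, n_in); the last layer is linear with n_out == 1.
    """
    a = x
    masks = []  # activation pattern of each hidden layer
    for W, b in zip(weights[:-1], biases[:-1]):
        z = W @ a + b
        masks.append(z > 0)
        a = np.maximum(z, 0.0)
    y = weights[-1] @ a + biases[-1]  # shape (1,)

    # Backward pass: within one activation pattern the network is affine,
    # so the gradient is a product of weight matrices with inactive rows zeroed.
    g = weights[-1].copy()  # shape (1, n_last_hidden)
    for W, m in zip(reversed(weights[:-1]), reversed(masks)):
        g = (g * m) @ W
    return float(y[0]), g.ravel()

def projected_gradient_ascent(weights, biases, x0, lower, upper,
                              step=0.1, iters=200):
    """Maximize the network output over the box [lower, upper] (illustrative only)."""
    x = np.clip(x0, lower, upper)
    best_x, best_val = x.copy(), -np.inf
    for _ in range(iters):
        val, grad = forward_and_grad(weights, biases, x)
        if val > best_val:
            best_x, best_val = x.copy(), val
        # Gradient step followed by projection back onto the box.
        x = np.clip(x + step * grad, lower, upper)
    val, _ = forward_and_grad(weights, biases, x)
    if val > best_val:
        best_x, best_val = x.copy(), val
    return best_x, best_val
```

Each iteration here costs one forward and one backward pass, with no optimization subproblem to solve. This is only a local method: the point it returns depends on the starting point `x0`, so in practice one would restart it from several initial inputs.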
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Machine Learning and ELM
