Optimization over Trained Neural Networks: Going Large with Gradient-Based Algorithms

Jiatai Tong; Yilin Zhu; Thiago Serra; Samuel Burer

arXiv:2512.24295·math.OC·March 19, 2026

TL;DR

This paper introduces a gradient-based local search algorithm for optimizing over trained neural network surrogates. Its per-iteration cost is lower than that of existing methods, and it becomes competitive with, and then dominant over, other local search methods as the optimization models grow larger.

Contribution

The paper proposes a gradient-based algorithm with lower per-iteration cost than existing methods, then adapts it to exploit the piecewise-linear structure of neural networks that use Rectified Linear Units (ReLUs).
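As background for the piecewise-linear structure mentioned above, the following is a standard textbook description of a scalar-output ReLU network, not the paper's own notation. The symbols W, b, D, c, and L are illustrative assumptions introduced here.

```latex
% Forward pass of a ReLU network with L layers (illustrative notation):
h^{(0)} = x, \qquad
h^{(\ell)} = \max\bigl(0,\; W^{(\ell)} h^{(\ell-1)} + b^{(\ell)}\bigr), \quad \ell = 1, \dots, L-1, \qquad
f(x) = W^{(L)} h^{(L-1)} + b^{(L)}.

% Let D^{(\ell)}(x) be the 0/1 diagonal matrix marking the active units of layer \ell at x.
% On the region where the activation pattern is fixed, f is affine in x:
f(x) = W^{(L)} D^{(L-1)}(x) W^{(L-1)} \cdots D^{(1)}(x) W^{(1)} x + c(x),
\qquad
\nabla f(x) = \bigl( W^{(L)} D^{(L-1)}(x) W^{(L-1)} \cdots D^{(1)}(x) W^{(1)} \bigr)^{\top}.
```

Here c(x) is constant on each linear region, and the gradient formula holds in the interior of a region. This is why a gradient at a point describes the network exactly on the whole surrounding linear piece, not just locally.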

Findings

01 The new algorithm reduces per-iteration computational cost.

02 It outperforms existing local search methods on large models.

03 It becomes dominant as model size increases.

Abstract

When optimizing a nonlinear objective, one can employ a neural network as a surrogate for the nonlinear function. However, the resulting optimization model can be time-consuming to solve globally with exact methods. As a result, local search that exploits the neural-network structure has been employed to find good solutions within a reasonable time limit. For such methods, a lower per-iteration cost is advantageous when solving larger models. The contribution of this paper is twofold. First, we propose a gradient-based algorithm with lower per-iteration cost than existing methods. Second, we further adapt this algorithm to exploit the piecewise-linear structure of neural networks that use Rectified Linear Units (ReLUs). In line with prior research, our methods become competitive with, and then dominant over, other local search methods as the optimization models become larger.
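To make the setting concrete, here is a minimal sketch of generic projected gradient ascent over the inputs of a fixed ReLU network. It is not the algorithm proposed in the paper, whose details are not given in the abstract. The function names, the box constraint, the step size, the choice to maximize, and the randomly generated "trained" network are all assumptions made for illustration.

```python
import random

def random_network(sizes, seed=0):
    """Build random weights/biases as a stand-in for a trained network (assumption)."""
    rng = random.Random(seed)
    weights = [[[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
               for n_in, n_out in zip(sizes[:-1], sizes[1:])]
    biases = [[rng.uniform(-1, 1) for _ in range(n_out)] for n_out in sizes[1:]]
    return weights, biases

def forward(weights, biases, x):
    """Evaluate the network; return the scalar output and the hidden-layer ReLU masks."""
    h, masks = list(x), []
    for W, b in zip(weights[:-1], biases[:-1]):
        z = [sum(w * v for w, v in zip(row, h)) + bi for row, bi in zip(W, b)]
        masks.append([1.0 if zi > 0 else 0.0 for zi in z])
        h = [max(0.0, zi) for zi in z]
    y = sum(w * v for w, v in zip(weights[-1][0], h)) + biases[-1][0]
    return y, masks

def input_gradient(weights, masks):
    """Gradient of the output w.r.t. the input on the currently active linear piece."""
    g = list(weights[-1][0])  # gradient w.r.t. the last hidden layer
    for W, m in zip(reversed(weights[:-1]), reversed(masks)):
        g = [gi * mi for gi, mi in zip(g, m)]  # zero out inactive units
        g = [sum(W[i][j] * g[i] for i in range(len(W))) for j in range(len(W[0]))]  # W^T g
    return g

def projected_gradient_ascent(weights, biases, x0, lo, hi, step=0.05, iters=200):
    """Maximize the network output over the box [lo, hi]^n, keeping the best point seen."""
    x = list(x0)
    best_x, best_y = list(x), forward(weights, biases, x)[0]
    for _ in range(iters):
        _, masks = forward(weights, biases, x)
        g = input_gradient(weights, masks)
        # Gradient step followed by projection (clipping) onto the box.
        x = [min(hi, max(lo, xi + step * gi)) for xi, gi in zip(x, g)]
        y = forward(weights, biases, x)[0]
        if y > best_y:
            best_x, best_y = list(x), y
    return best_x, best_y

weights, biases = random_network([3, 8, 8, 1], seed=0)
x0 = [0.0, 0.0, 0.0]
y0 = forward(weights, biases, x0)[0]
best_x, best_y = projected_gradient_ascent(weights, biases, x0, lo=-1.0, hi=1.0)
print(f"start: {y0:.4f}  best found: {best_y:.4f}")
```

Each iteration costs one forward pass and one backward pass, which is the kind of cheap step that matters when the network is large. Like any local search, the sketch returns a good point, with no guarantee of global optimality.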

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Machine Learning and ELM