Deep FlexQP: Accelerated Nonlinear Programming via Deep Unfolding

Alex Oshin; Rahul Vodeb Ghosh; Augustinos D. Saravanos; Evangelos A. Theodorou

arXiv:2512.01565·math.OC·March 6, 2026

Open Access · 3 Reviews

TL;DR

Deep FlexQP introduces a robust, accelerated deep-unfolded convex quadratic programming solver that effectively handles feasible and infeasible problems, significantly improving speed and safety in nonlinear optimization tasks.

Contribution

The paper presents FlexQP, a novel convex QP solver with provable optimality and infeasibility handling, enhanced by deep unfolding for accelerated performance and tighter generalization bounds.

Findings

01. Outperforms state-of-the-art learned QP solvers on benchmarks.

02. Scales to dense QPs with over 10,000 variables and constraints.

03. Solves nonlinear trajectory optimization 4-16x faster than traditional SQP.

Abstract

We propose FlexQP, an always-feasible convex quadratic programming (QP) solver based on an $ℓ_{1}$ elastic relaxation of the QP constraints. If the original constraints are feasible, FlexQP provably recovers the optimal solution. If the constraints are infeasible, FlexQP identifies a solution that minimizes the constraint violation while keeping the number of violated constraints sparse. Such infeasibilities arise naturally in sequential quadratic programming (SQP) subproblems due to the linearization of the constraints. We prove the convergence of FlexQP under mild coercivity assumptions, making it robust to both feasible and infeasible QPs. We then apply deep unfolding to learn LSTM-based, dimension-agnostic feedback policies for the algorithm parameters, yielding an accelerated Deep FlexQP. To preserve the exactness guarantees of the relaxation, we propose a normalized training loss…
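The two behaviors the abstract describes (recovering the constrained optimum when the constraints are feasible, and returning a minimum-violation point with few violated constraints when they are not) can be seen in a one-dimensional toy problem. The sketch below is not the FlexQP algorithm and does not use the paper's formulation or its ADMM updates: it assumes a plain $\ell_1$ penalty on constraint violation, a scalar quadratic objective, and a ternary search as the solver. All function and variable names (`penalized_objective`, `minimize_convex_1d`, `solve`, `mu`, `target`) are hypothetical.

```python
# Toy illustration only: a 1-D QP with an l1 penalty on constraint violation.
# Every name here is hypothetical; this is NOT the FlexQP algorithm itself.

def penalized_objective(x, mu, target, upper_bounds=(), lower_bounds=()):
    """Return 0.5*(x - target)^2 + mu * (total l1 constraint violation)."""
    violation = sum(max(0.0, x - u) for u in upper_bounds)   # x <= u
    violation += sum(max(0.0, l - x) for l in lower_bounds)  # x >= l
    return 0.5 * (x - target) ** 2 + mu * violation


def minimize_convex_1d(f, lo=-10.0, hi=10.0, iters=200):
    """Ternary search; valid here because the penalized objective is convex."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def solve(mu, upper_bounds=(), lower_bounds=(), target=3.0):
    """Minimize the penalized objective for the given bounds and weight mu."""
    return minimize_convex_1d(
        lambda x: penalized_objective(x, mu, target, upper_bounds, lower_bounds)
    )


# Feasible case: minimize 0.5*(x - 3)^2 subject to x <= 1.
# The constrained optimum is x = 1 with multiplier magnitude 2.
x_exact = solve(mu=5.0, upper_bounds=(1.0,))  # penalty weight above 2
x_weak = solve(mu=1.0, upper_bounds=(1.0,))   # penalty weight below 2

# Infeasible case: x <= 1 and x >= 2 cannot both hold.
x_infeasible = solve(mu=5.0, upper_bounds=(1.0,), lower_bounds=(2.0,))

print(x_exact, x_weak, x_infeasible)
```

In this toy setting, a penalty weight larger than the multiplier magnitude (5 versus 2) makes the penalized minimizer coincide with the constrained optimum at x = 1, while a weight that is too small (1) leaves the constraint violated at x = 2. In the infeasible case the minimizer lands at x = 2, which satisfies x ≥ 2 exactly and violates only x ≤ 1, so the total violation is the smallest possible and is concentrated on a single constraint.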

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 3

Strengths

1. Accelerating constrained optimization with learning is timely and useful.
2. Primal–dual parameterization + KKT residuals makes the supervision meaningful; the post-refinement stage is practical for polishing errors.
3. On synthetic QP-style tasks, the one-step (or few-step) approach achieves competitive gaps/residuals with favorable wall-clock times.

Weaknesses

1. Limited novelty in core ingredients. Diffusion generation, GNN message passing over factor graphs, and KKT-residual losses are all known; the paper reads as a careful composition/tuning rather than a new algorithmic principle or theory.
2. The paper lacks component-wise ablations that isolate the value of diffusion vs. a non-diffusive predictor, GNN vs. MLP, and KKT loss vs. plain supervised losses, as well as sensitivity to refinement steps and guidance scales.
3. Under what assumptions (e

Reviewer 02 · Rating 2 · Confidence 4

Strengths

This paper presents a novel learning-enhanced ADMM framework, supported by theoretical analysis and demonstrated to achieve faster convergence rates compared to established baselines across diverse datasets.

Weaknesses

1. The motivation for introducing slack variables and an ℓ₁-penalty term appears insufficiently justified. Since the ADMM-based solver OSQP can directly solve Problem (1), why not simply accelerate that algorithm using a neural network? Is the intention to use $z_I$ and $z_E$ to determine the feasibility of the original problem? However, in practice, $z_I$ and $z_E$ are unlikely to be exactly zero during iterations, as their values are strongly influenced by how well constraint (4b) can be satis

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. The idea of using a unified penalty formulation to treat both feasible and infeasible points within the same objective is novel, as it yields a single ADMM-based procedure.
2. Unfolding learns LSTM-based parameter policies while retaining the structure of the original solver, enabling accelerations to the original approach without discarding the algorithmic backbone.
3. The authors provide theoretical support, including convergence characterizations of the penalty/ADMM scheme and PAC-Bayes g

Weaknesses

1. The motivation behind, and the advantages of, using an $\ell_1$ penalty are not clear. The theory part claims properties of points that solve the problem, but it does not directly establish a guarantee on whether Algorithm 1 and Deep FlexQP can converge to those feasible/optimal solutions. A detailed explanation would be helpful.
2. The significance of the reported acceleration is unclear. As noted, the dominant cost remains the first ADMM block update, and in some cases Deep FlexQP does not surpas

Taxonomy

Topics: Risk and Portfolio Optimization · Advanced Bandit Algorithms Research · Adversarial Robustness in Machine Learning