Descent-Net: Learning Descent Directions for Constrained Optimization
Zisheng Zhou, Dengyu Zheng, Zirui Chen, Shixiang Chen

TL;DR
Descent-Net is a neural network that learns descent directions to improve solutions of constrained optimization problems, ensuring feasibility and better objective values, with demonstrated effectiveness on synthetic and real-world large-scale problems.
Contribution
We introduce Descent-Net, a novel neural network approach that learns descent directions to enhance constrained optimization solutions while maintaining feasibility.
Findings
Effective on synthetic optimization tasks
Successfully applied to AC optimal power flow
Scalable to large portfolio optimization problems
Abstract
Deep learning approaches, known for their ability to model complex relationships and fast execution, are increasingly being applied to solve large optimization problems. However, existing methods often face challenges in simultaneously ensuring feasibility and achieving an optimal objective value. To address this issue, we propose Descent-Net, a neural network designed to learn an effective descent direction from a feasible solution. By updating the solution along this learned direction, Descent-Net improves the objective value while preserving feasibility. Our method demonstrates strong performance on both synthetic optimization tasks and the real-world AC optimal power flow problem, while also exhibiting effective scalability to large problems, as shown by portfolio optimization experiments with thousands of assets.
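The update described in the abstract (start from a feasible point, move along a descent direction, stay feasible) can be illustrated with a minimal sketch. This is not the authors' method: the learned network is replaced here by a hand-coded direction, the negative gradient projected onto the null space of a single linear equality constraint. The toy objective, the constraint, the step size, and all function names are assumptions made for illustration only.

```python
# Hypothetical sketch: a hand-coded feasible descent step standing in for
# the learned direction described in the abstract. Not the authors' code.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def objective(x, c):
    # Assumed toy objective: f(x) = 0.5 * ||x - c||^2
    return 0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, c))

def gradient(x, c):
    # Gradient of the toy objective: x - c
    return [xi - ci for xi, ci in zip(x, c)]

def feasible_descent_direction(g, a):
    # Project -g onto the null space of the constraint a.x = b:
    # d = -(g - (a.g / a.a) * a), which satisfies a.d = 0.
    scale = dot(a, g) / dot(a, a)
    return [-(gi - scale * ai) for gi, ai in zip(g, a)]

def descent_step(x, c, a, step=0.5):
    # Move from a feasible x along d; a.(x + step * d) = a.x because a.d = 0.
    d = feasible_descent_direction(gradient(x, c), a)
    return [xi + step * di for xi, di in zip(x, d)]

# Assumed toy instance: budget-style constraint sum(x) = 1.
a = [1.0, 1.0, 1.0]
b = 1.0
c = [0.7, 0.2, 0.4]
x0 = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]  # feasible starting point
x1 = descent_step(x0, c, a)

print("constraint residual:", dot(a, x1) - b)
print("objective before/after:", objective(x0, c), objective(x1, c))
```

Because the direction is orthogonal to the constraint normal, the equality constraint holds after the step up to rounding error, and the objective decreases for this step size. For nonlinear or inequality constraints a direction chosen this way does not by itself guarantee feasibility, which is the setting the paper and its reviewers discuss.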
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. The paper studies an important and timely topic, addressing constrained optimization with neural networks. 2. The proposed pipeline is clear and closely follows the standard PGD approach, making it easy to understand the overall structure.
1. The contribution of the paper is vague and not clearly distinguished from prior work. 2. The technical approach is not fully rational or convincing, as the feasibility guarantee for nonlinear, non-convex constraints is not theoretically ensured. 3. The paper is difficult to follow; it would benefit significantly from more illustrative figures and a clearer pipeline description.
1. The paper is well-written and well-organized.
1. The experimental results are weak, as the proposed method shows only marginal improvements over some of the compared approaches. Moreover, it appears that the baseline methods used for comparison are not state-of-the-art for the problems considered in the experimental section. For example, general-purpose solvers based on trust-region methods, such as Fmincon and Knitro, should also be included in the experiments. For problems with smooth objectives and satisfying the LICQ condition, interior
* The submission is an interesting and thoughtful application of the Uniformly Feasible Direction Subproblem within neural networks. * The Descent-Net method convincingly improves upon the optimality performance of DC3 (which provides feasibility but sometimes struggles with optimality) on convex QPs, the simple nonconvex problem class, and the ACOPF 30-bus system. It also improves, albeit more marginally, upon optimality performance for ACOPF 118-bus. This is a significant contribution, as SOTA
* There is a major missing ablation: What happens if the learnable modules $T^k$ are not applied in the descent module, i.e., what if only the original projected subgradient updates are applied? In general, what about other post-hoc iterative refinement strategies (with or without learnable parameters)? * The timing improvements for ACOPF 30-bus and ACOPF 118-bus are more marginal (only 2x faster than traditional solver for ACOPF 30-bus, and only 30% faster for ACOPF 118-bus). In addition, for AC
* The paper considers a constrained learning task (where the output of a machine learning model should satisfy constraints), which has received less attention in the literature (although there is a growing body of works in that field) * The paper leverages insights from existing optimization algorithms to inform the design of their ML framework
* The proposed method relies on inverting the matrix $H^{\top}H$, where $H$ is the Jacobian of equality constraints at the current iterate. This is a large obstacle to scalability, as (i) this matrix may become dense and numerically ill-conditioned (especially close to the optimum), and (ii) forming and inverting this matrix will become expensive for larger instances. * The convergence result of Theorem 4.2 relies on the universal approximation theorem of ReLU networks. It is an existential resu
Taxonomy
Topics: Optimal Power Flow Distribution · Advanced Neural Network Applications · Electric Power System Optimization
