You Shall Pass: Dealing with the Zero-Gradient Problem in Predict and Optimize for Convex Optimization

Grigorii Veviurko, Wendelin Böhmer, and Mathijs de Weerdt

arXiv:2307.16304 · cs.LG · February 5, 2024 · 1 citation

Open Access

TL;DR

This paper addresses the zero-gradient problem in predict-and-optimize for convex problems, proposing a smoothing technique and a novel Jacobian approximation method that improve training performance.

Contribution

The paper formally proves that smoothing the feasible set resolves the zero-gradient issue in non-linear convex problems, and it develops a new Jacobian approximation method.

Findings

1. Improves training performance on non-linear convex problems.
2. Matches the state of the art on linear problems.
3. Demonstrates effectiveness through simulation experiments.

Abstract

Predict and optimize is an increasingly popular decision-making paradigm that employs machine learning to predict unknown parameters of optimization problems. Instead of minimizing the prediction error of the parameters, it trains predictive models using task performance as a loss function. The key challenge in training such models is computing the Jacobian of the solution of the optimization problem with respect to its parameters. For linear problems, this Jacobian is known to be zero or undefined; hence, approximations are usually employed. For non-linear convex problems, however, it is common to use the exact Jacobian. This paper demonstrates that the zero-gradient problem appears in the non-linear case as well -- the Jacobian can have a sizeable null space, thereby causing the training process to get stuck in suboptimal points. Through formal proofs, this paper shows that…
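
To make the zero-gradient statement for linear problems concrete, here is a minimal numerical sketch. It is a toy illustration and not the paper's method: the function names (`solve_lp_box`, `solve_lp_ball`, `fd_jacobian`), the example vector `w`, and the choice of the unit ball as a stand-in for a "smooth" feasible set are all assumptions made for this example. It compares the finite-difference Jacobian of the solution of a linear objective over a box (a polytope) with that over a Euclidean ball.

```python
import numpy as np


def solve_lp_box(w):
    """Hypothetical helper: argmax_x w^T x subject to 0 <= x <= 1.

    The closed-form solution sets x_i = 1 where w_i > 0 and 0 elsewhere,
    so the solution is piecewise constant in w.
    """
    return (w > 0).astype(float)


def solve_lp_ball(w):
    """Hypothetical helper: argmax_x w^T x subject to ||x||_2 <= 1.

    The closed-form solution is x = w / ||w||, which varies smoothly with w.
    """
    return w / np.linalg.norm(w)


def fd_jacobian(solver, w, eps=1e-6):
    """Central finite-difference estimate of d x*(w) / d w."""
    n = w.size
    jac = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        jac[:, j] = (solver(w + step) - solver(w - step)) / (2 * eps)
    return jac


# Assumed example parameters; every entry is far from 0, so the box
# solution does not change under small perturbations.
w = np.array([0.7, -0.3, 1.2])

jac_box = fd_jacobian(solve_lp_box, w)    # all zeros: no training signal
jac_ball = fd_jacobian(solve_lp_ball, w)  # non-zero, but singular along w

# Analytic Jacobian of w / ||w|| for comparison.
norm_w = np.linalg.norm(w)
jac_ball_exact = (np.eye(w.size) - np.outer(w, w) / norm_w**2) / norm_w

print("box Jacobian:\n", jac_box)
print("ball Jacobian:\n", jac_ball)
```

Over the box, the solution sits at a vertex and does not move under small changes in `w`, so any loss back-propagated through it receives a zero gradient. Over the ball, the Jacobian is non-zero, yet it still maps the direction `w` itself to zero, which gives a small-scale picture of how a Jacobian can have a non-trivial null space even when the feasible set is not a polytope.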


Taxonomy

Topics: Stochastic Gradient Optimization Techniques