Minimizing Surrogate Losses for Decision-Focused Learning using Differentiable Optimization

Jayanta Mandi; Ali İrfan Mahmutoğulları; Senne Berden; Tias Guns

arXiv:2508.11365·cs.LG·August 26, 2025

TL;DR

This paper addresses the problem of zero gradients in decision-focused learning for linear programs. It proposes minimizing surrogate losses instead of regret, which enables more effective training and improves decision quality.

Contribution

The paper demonstrates that minimizing surrogate losses rather than regret improves gradient-based decision-focused learning for LPs, even when a differentiable optimization layer such as DYS-Net is used.

Findings

1. Surrogate loss minimization achieves comparable or better regret than direct regret minimization.

2. Using DYS-Net with surrogate losses reduces training time significantly.

3. Surrogate losses enable effective gradient computation where regret gradients are zero.

Abstract

Decision-focused learning (DFL) trains a machine learning (ML) model to predict parameters of an optimization problem, to directly minimize decision regret, i.e., maximize decision quality. Gradient-based DFL requires computing the derivative of the solution to the optimization problem with respect to the predicted parameters. However, for many optimization problems, such as linear programs (LPs), the gradient of the regret with respect to the predicted parameters is zero almost everywhere. Existing gradient-based DFL approaches for LPs try to circumvent this issue in one of two ways: (a) smoothing the LP into a differentiable optimization problem by adding a quadratic regularizer and then minimizing the regret directly or (b) minimizing surrogate losses that have informative (sub)gradients. In this paper, we show that the former approach still results in zero gradients, because even…
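
The zero-gradient issue described in the abstract can be illustrated with a minimal sketch on a toy LP. Everything here is an assumption made for illustration and is not taken from the paper: the feasible region is the probability simplex (so the LP solution is the unit vector at the smallest cost), the helper names `solve_lp`, `regret`, `spo_plus_loss`, and `spo_plus_subgradient` are hypothetical, and SPO+ is used only as one well-known example of a surrogate loss with informative subgradients. The excerpt above does not state which surrogates the authors use.

```python
import numpy as np

def solve_lp(c):
    # Toy LP (assumed): minimize c @ w over the probability simplex.
    # The vertices are unit vectors, so an optimum sits at argmin(c).
    w = np.zeros_like(c, dtype=float)
    w[np.argmin(c)] = 1.0
    return w

def regret(c_hat, c_true):
    # Extra true cost from acting on the prediction instead of the truth.
    return float(c_true @ solve_lp(c_hat) - c_true @ solve_lp(c_true))

def spo_plus_loss(c_hat, c_true):
    # SPO+ surrogate for a minimization LP:
    # max_w (c - 2*c_hat) @ w + 2*c_hat @ w*(c) - c @ w*(c)
    w_true = solve_lp(c_true)
    w_adv = solve_lp(2 * c_hat - c_true)  # maximizer of (c - 2*c_hat) @ w
    return float((c_true - 2 * c_hat) @ w_adv
                 + 2 * c_hat @ w_true - c_true @ w_true)

def spo_plus_subgradient(c_hat, c_true):
    # A subgradient of the SPO+ loss with respect to c_hat.
    return 2.0 * (solve_lp(c_true) - solve_lp(2 * c_hat - c_true))

c_true = np.array([1.0, 2.0, 3.0])
c_hat = np.array([2.5, 2.0, 3.0])  # wrongly ranks the second item cheapest

# Central finite differences of the regret: small perturbations of c_hat
# do not change the chosen vertex, so every entry is exactly zero.
eps = 1e-4
fd_grad = np.array([
    (regret(c_hat + eps * e, c_true) - regret(c_hat - eps * e, c_true)) / (2 * eps)
    for e in np.eye(3)
])

print("regret:", regret(c_hat, c_true))
print("finite-difference gradient of regret:", fd_grad)
print("SPO+ loss:", spo_plus_loss(c_hat, c_true))
print("SPO+ subgradient:", spo_plus_subgradient(c_hat, c_true))
```

In this toy instance the regret is positive but locally flat, so a gradient step on the regret would leave the prediction unchanged. The SPO+ subgradient is nonzero: a descent step lowers the predicted cost of the truly cheapest item and raises that of the wrongly chosen one.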
