Managing Solution Stability in Decision-Focused Learning with Cost Regularization

Victor Spitzer; Francois Sanson

arXiv:2601.21883·cs.LG·January 30, 2026

Open Access · 3 Reviews

TL;DR

This paper enhances decision-focused learning by analyzing solution stability issues caused by perturbation fluctuations and proposing cost regularization to improve robustness and training effectiveness.

Contribution

It introduces a theoretical link between perturbation fluctuations and solution stability, and proposes a cost regularization method to improve learning robustness.

Findings

01 Regularization improves decision quality in experiments

02 Fluctuations in perturbation intensity affect training stability

03 Proposed method enhances robustness of decision-focused models

Abstract

Decision-focused learning integrates predictive modeling and combinatorial optimization by training models to directly improve decision quality rather than prediction accuracy alone. Differentiating through combinatorial optimization problems represents a central challenge, and recent approaches tackle this difficulty by introducing perturbation-based approximations. In this work, we focus on estimating the objective function coefficients of a combinatorial optimization problem. Our study demonstrates that fluctuations in perturbation intensity occurring during the learning phase can lead to ineffective training, by establishing a theoretical link to the notion of solution stability in combinatorial optimization. We propose addressing this issue by introducing a regularization of the estimated cost vectors which improves the robustness and reliability of the learning process, as…
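The link between perturbation intensity and solution stability can be illustrated with a minimal sketch. It is not the authors' code: the toy problem (pick the single cheapest item), the Gaussian perturbation, and the names `solve`, `perturbed_solution`, `base_cost`, and `sigma` are all assumptions made for illustration. With a fixed noise level, the magnitude of the estimated cost vector alone decides whether the perturbed solution carries any training signal.

```python
import numpy as np


def solve(cost):
    """Toy combinatorial problem (assumed): pick the single cheapest item."""
    x = np.zeros_like(cost, dtype=float)
    x[np.argmin(cost)] = 1.0
    return x


def perturbed_solution(cost, sigma, n_samples=2000, seed=0):
    """Monte Carlo estimate of E[solve(cost + sigma * Z)] with Z ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_samples, cost.size))
    return np.mean([solve(cost + sigma * z) for z in noise], axis=0)


base_cost = np.array([1.0, 1.2, 1.5, 2.0])  # hypothetical cost vector
sigma = 1.0                                 # fixed perturbation intensity

# Same cost direction, very different magnitudes.
small = perturbed_solution(0.01 * base_cost, sigma)
large = perturbed_solution(1000.0 * base_cost, sigma)

print("small-scale costs:", small)  # noise dominates: close to uniform
print("large-scale costs:", large)  # noise is negligible: a single vertex
```

When the costs are large relative to `sigma`, every sample returns the same solution, so the averaged solution is piecewise constant in the costs and yields no useful gradient. When the costs are small, the solution is driven almost entirely by noise. Keeping the cost magnitude under control is one way to stay between these two regimes.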

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 2

Strengths

This paper provides a meaningful conceptual clarification and a practical normalization mechanism that helps stabilize a widely used—but often fragile—class of DFL methods. The insight linking stability radius with learning dynamics is both useful and broadly relevant.

Weaknesses

1. Some notations and definitions in the paper are not rigorous; see Questions.
2. The paper lacks discussion of perturbed optimizers beyond the MILP case.

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. The writing is generally clear.
2. The viewpoint of interpreting different decision-focused learning methods through the concept of solution stability is novel and interesting.

Weaknesses

1. The introduction and explanation of the four properties in Section 3 could be clearer; adding examples may aid understanding.
2. There are some typos, for example inconsistent capitalization of the initial letter in “property.”

Reviewer 03 · Rating 2 · Confidence 4

Strengths

I think this work makes a very good case for how controlling the scale of perturbations (or of sampling processes) can dramatically affect the behavior of DFL training methods that make use of this idea (which are at this point many and among the best performers). The discussion of how different classes of methods either become ineffective or collapse to solution imitation is well done and convincing, even if somewhat informal. I also believe that the proposed normalization technique can be

Weaknesses

The key issue I see in this work is that the proposed approach does not appear to address the analyzed problem. Based on the formulation from eq. (19), the normalization mapping is applied to the parameter vector just before it is fed to the optimization process (the f mapping). In a perturbation-based approach, this means that normalization would be applied to the perturbed parameters, after the scale mismatch has already done all the damage extensively documented in the first half of the paper.
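The order-of-operations concern can be made concrete with a small sketch. Eq. (19) is not reproduced on this page, so everything here is an assumption for illustration: a toy argmin problem, L2 normalization, Gaussian noise, and the hypothetical names `solve` and `normalize`. The sketch relies only on the fact that rescaling a linear objective by a positive factor cannot change its minimizer.

```python
import numpy as np


def solve(cost):
    """Toy linear problem (assumed): index of the cheapest item."""
    return int(np.argmin(cost))


def normalize(v):
    """L2 normalization, standing in for the paper's regularization mapping."""
    return v / np.linalg.norm(v)


rng = np.random.default_rng(0)
cost = 1000.0 * np.array([1.0, 1.2, 1.5, 2.0])  # badly scaled estimate
sigma = 1.0                                     # fixed perturbation intensity
noise = rng.standard_normal((2000, cost.size))

# No normalization at all.
plain = [solve(cost + sigma * z) for z in noise]
# Normalization applied to the already perturbed costs.
after = [solve(normalize(cost + sigma * z)) for z in noise]
# Normalization applied to the costs before they are perturbed.
before = [solve(normalize(cost) + sigma * z) for z in noise]

print("identical to unnormalized:", plain == after)
print("distinct solutions (after): ", len(set(after)))
print("distinct solutions (before):", len(set(before)))
```

In this toy setting, normalizing after the perturbation leaves every sampled solution unchanged, which matches the reviewer's reading. Normalizing before the perturbation restores a meaningful ratio between cost magnitude and noise. Whether the paper's method corresponds to the first or the second ordering depends on eq. (19) itself.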

Taxonomy

Topics: Advanced Multi-Objective Optimization Algorithms · Gaussian Processes and Bayesian Inference · Advanced Bandit Algorithms Research