Online Decision-Focused Learning

Aymeric Capitaine; Maxime Haddouche; Eric Moulines; Michael I. Jordan; Etienne Boursier; Alain Durmus

arXiv:2505.13564·cs.LG·March 10, 2026

Open Access 3 Reviews

TL;DR

This paper develops online algorithms for decision-focused learning in dynamic environments, addressing challenges of non-convexity and non-differentiability, and provides theoretical regret bounds with empirical validation.

Contribution

It introduces the first online algorithms with provable guarantees for decision-focused learning in evolving environments, handling non-convex and non-differentiable objectives.

Findings

01

Algorithms outperform benchmarks in knapsack experiments

02

Established static and dynamic regret bounds

03

Effective handling of non-convex, non-differentiable objectives
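For orientation, the two notions of regret named in the findings are usually defined as follows in the online learning literature. This is the textbook form and not necessarily the paper's exact statement; the symbols $f_t$ (loss at round $t$), $\theta_t$ (the learner's parameter at round $t$), $\Theta$ (the parameter set), and $T$ (the horizon) are notation assumed here.

```latex
\begin{align*}
\text{Static regret:}\quad
  & R_T^{\mathrm{stat}} = \sum_{t=1}^{T} f_t(\theta_t)
    - \min_{\theta \in \Theta} \sum_{t=1}^{T} f_t(\theta), \\
\text{Dynamic regret:}\quad
  & R_T^{\mathrm{dyn}} = \sum_{t=1}^{T} f_t(\theta_t)
    - \sum_{t=1}^{T} \min_{\theta \in \Theta} f_t(\theta).
\end{align*}
```

Static regret compares the learner with the single best fixed parameter in hindsight, while dynamic regret lets the comparator change every round, so the dynamic quantity is never smaller than the static one.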

Abstract

Decision-focused learning (DFL) is an increasingly popular paradigm for training predictive models whose outputs are used in decision-making tasks. Instead of merely optimizing for predictive accuracy, DFL trains models to directly minimize the loss associated with downstream decisions. However, existing studies focus solely on scenarios where a fixed batch of data is available and the objective function does not change over time. We instead investigate DFL in dynamic environments where the objective function and data distribution evolve over time. This setting is challenging for online learning because the objective function has zero or undefined gradients, which prevents the use of standard first-order optimization methods, and is generally non-convex. To address these difficulties, we (i) regularize the objective to make it differentiable and (ii) use perturbation techniques along…
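The zero-gradient issue mentioned in the abstract can be seen on a toy problem. The sketch below is an illustration only and is not the authors' algorithm: it assumes a hypothetical one-parameter cost model (`predicted`), a decision that picks one of three items, and an entropy regularizer (a softmin) as one possible way to smooth the decision. It does not cover the perturbation techniques the abstract also mentions.

```python
import numpy as np

def argmin_decision(costs):
    """Unregularized decision: one-hot vector on the cheapest predicted item."""
    w = np.zeros_like(costs, dtype=float)
    w[np.argmin(costs)] = 1.0
    return w

def soft_decision(costs, temperature=0.5):
    """Entropy-regularized decision (assumed regularizer): softmin over costs."""
    z = -np.asarray(costs, dtype=float) / temperature
    z -= z.max()  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

def decision_loss(decision, true_costs):
    """Cost actually paid when the decision is evaluated on the true costs."""
    return float(decision @ true_costs)

def finite_diff(fn, x, eps=1e-4):
    """Central finite-difference estimate of d fn / d x."""
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)

true_costs = np.array([1.0, 2.0, 3.0])

def predicted(theta):
    """Hypothetical model: theta is the predicted cost of the first item."""
    return np.array([theta, 2.0, 3.0])

def hard_loss(theta):
    return decision_loss(argmin_decision(predicted(theta)), true_costs)

def soft_loss(theta):
    return decision_loss(soft_decision(predicted(theta)), true_costs)

# Small changes in theta do not change the argmin, so the hard loss is flat.
hard_grad = finite_diff(hard_loss, 1.5)
# The regularized loss varies smoothly with theta and gives a usable signal.
soft_grad = finite_diff(soft_loss, 1.5)
print(hard_grad, soft_grad)
```

The hard decision loss is piecewise constant in `theta`, so its gradient is zero wherever it is defined and undefined at the jump (here at `theta = 2.0`), whereas the smoothed loss has a nonzero slope that a first-order method can follow.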

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01·Rating 10·Confidence 4

Strengths

This is an excellent paper!

* The paper is very well written and includes intuitive explanations along with precise mathematical statements.
* The paper tackles a fundamental problem: online decision-focused learning, where the goal is to train predictive models not just for prediction but for using those predictions in a downstream decision-making task. While prior work looked at this problem in the offline setting, this paper studies this problem in the online setting, provides provable regr

Weaknesses

Please see the questions section for some weaknesses/questions.

Reviewer 02·Rating 6·Confidence 3

Strengths

Originality: The originality arises from a proper problem formulation for the regret analysis of decision-focused online optimization, incorporating static and dynamic regret analysis into the said problem and removing limitations of non-differentiability and non-convexity.

Quality: The submission seems technically correct, experimentally rigorous and reproducible (except minor caveats in the algorithms).

Clarity: The submission is mostly clear.

Significance: The submission presents theoretic

Weaknesses

I did not notice any substantial weakness, so I am leaning towards acceptance. One thing to note is that the exact challenge in achieving the said results could be emphasized. Possibly, after proper formulation, everything follows from the existing regret analysis techniques.

Reviewer 03·Rating 4·Confidence 3

Strengths

* The paper extends the decision-focused framework to online learning, whereas most of the existing literature on the topic is limited to the batch case, where iid samples and a fixed estimation/optimization problem are available.
* The authors provide a complete analysis of the problem, leveraging state-of-the-art algorithms and tools from the online learning literature.

Weaknesses

* The technical novelty of the paper is limited. While the authors stress the challenges posed by the online decision-focused learning setting (non-differentiable and non-convex functions), once formulated as in eq.(3) the problem is amenable to any "standard" online learning treatment. Indeed most of the results in the paper are obtained by carefully instantiating known assumptions, algorithms, and theoretical results, such as assumptions H1/H2, FTL and OGD, regularization and approximate oracl

Taxonomy

Topics: Online and Blended Learning

Methods: Focus