Finite and Corruption-Robust Regret Bounds in Online Inverse Linear Optimization under M-Convex Action Sets
Taihei Oki, Shinsaku Sakaue

TL;DR
This paper establishes finite regret bounds for online inverse linear optimization over M-convex sets that depend on the dimension but not on the time horizon, and extends them to adversarially corrupted feedback with adaptive corruption detection.
Contribution
It proves that a finite regret bound of O(d log d) is achievable when the feasible sets are M-convex, answering for this class the open question of whether a finite regret bound polynomial in the dimension d is possible.
Findings
Finite regret bound of O(d log d) for M-convex sets.
Extension to adversarially corrupted feedback with regret O((C+1)d log d).
Adaptive corruption detection without prior knowledge of C.
Abstract
We study online inverse linear optimization, also known as contextual recommendation, where a learner sequentially infers an agent's hidden objective vector from observed optimal actions over feasible sets that change over time. The learner aims to recommend actions that perform well under the agent's true objective, and the performance is measured by the regret, defined as the cumulative gap between the agent's optimal values and those achieved by the learner's recommended actions. Prior work has established a regret bound of O(d log T), as well as a finite but exponentially large bound of exp(O(d log d)), where d is the dimension of the optimization problem and T is the time horizon, while a regret lower bound of Ω(d) is known (Gollapudi et al. 2021; Sakaue et al. 2025). Whether a finite regret bound polynomial in d is achievable or not has remained an open question.…
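The regret defined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm; it only evaluates the regret of a given sequence of recommendations. It assumes the agent maximizes a linear objective and that each round's feasible set is a small finite list of vectors. The names `dot`, `best_action`, and `cumulative_regret` are hypothetical helpers introduced here for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def best_action(c: Vector, feasible: List[Vector]) -> Vector:
    """Return an action in `feasible` that maximizes <c, x> (assumed maximization)."""
    return max(feasible, key=lambda x: dot(c, x))


def cumulative_regret(
    c_true: Vector,
    feasible_sets: List[List[Vector]],
    recommendations: List[Vector],
) -> float:
    """Sum over rounds of (agent's optimal value - value of the recommended action)."""
    total = 0.0
    for feasible, x_hat in zip(feasible_sets, recommendations):
        x_star = best_action(c_true, feasible)
        total += dot(c_true, x_star) - dot(c_true, x_hat)
    return total


# Toy instance (illustrative numbers only, not from the paper).
c_true = (2.0, 1.0)
feasible_sets = [
    [(1, 0), (0, 1)],          # round 1
    [(1, 0), (0, 1), (1, 1)],  # round 2
]
recommendations = [(0, 1), (1, 1)]  # suboptimal in round 1, optimal in round 2

regret = cumulative_regret(c_true, feasible_sets, recommendations)
```

In this toy instance the learner loses value only in round 1, so the cumulative regret equals that single gap. A finite regret bound means this sum stays bounded no matter how many rounds are played.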
