Contextual Inverse Optimization: Offline and Online Learning
Omar Besbes, Yuri Fonseca, Ilan Lobel

TL;DR
This paper introduces a new framework for offline and online contextual optimization using feedback on the optimal actions, providing algorithms with logarithmic regret bounds and demonstrating superior performance through simulations.
Contribution
It characterizes the optimal minimax policy in offline settings and develops the first algorithm with logarithmic regret bounds for online contextual optimization.
Findings
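The proposed algorithms outperform previous methods in simulations.
In the offline setting, the achievable performance is characterized as a function of the geometry of the information induced by the data.
In the online setting, the proposed algorithm achieves logarithmic regret bounds.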
Abstract
We study the problems of offline and online contextual optimization with feedback information, where instead of observing the loss, we observe, after the fact, the optimal action an oracle with full knowledge of the objective function would have taken. We aim to minimize regret, which is defined as the difference between our losses and the ones incurred by an all-knowing oracle. In the offline setting, the decision-maker has information available from past periods and needs to make one decision, while in the online setting, the decision-maker optimizes decisions dynamically over time based on a new set of feasible actions and contextual functions in each period. For the offline setting, we characterize the optimal minimax policy, establishing the performance that can be achieved as a function of the underlying geometry of the information induced by the data. In the online setting, we…
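The regret notion described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, a linear loss in a hidden cost vector and a small finite set of feasible actions; the names `loss`, `oracle_action`, and `regret` and the numbers used are hypothetical and not taken from the paper.

```python
def loss(cost, action):
    """Linear loss of an action under a cost vector (an illustrative assumption)."""
    return sum(c * a for c, a in zip(cost, action))


def oracle_action(actions, cost):
    """Index of the action an all-knowing oracle would pick (minimum loss)."""
    return min(range(len(actions)), key=lambda i: loss(cost, actions[i]))


def regret(actions, cost, chosen_idx):
    """Loss of the chosen action minus the oracle's loss; never negative."""
    best_idx = oracle_action(actions, cost)
    return loss(cost, actions[chosen_idx]) - loss(cost, actions[best_idx])


# Hypothetical instance: three feasible actions and a cost vector that the
# decision-maker does not observe.
actions = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
true_cost = (2.0, 1.0)

chosen = 0                                     # the decision-maker's pick
feedback = oracle_action(actions, true_cost)   # revealed after the fact
r = regret(actions, true_cost, chosen)
```

In this sketch the decision-maker never sees `loss` values directly; the only feedback is `feedback`, the index of the oracle's action, which is the information structure the abstract describes.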
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms
