Distributionally-Robust Learning to Optimize
Vinit Ranjan, Jisun Park, Bartolomeo Stellato

TL;DR
This paper introduces a distributionally robust framework for learning the hyperparameters (such as step sizes) of first-order methods in convex optimization. A robustness radius interpolates between classical learning to optimize (L2O) and worst-case optimal design, and the approach comes with high-probability risk bounds and strong empirical results.
Contribution
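It unifies L2O and worst-case optimal algorithm design through a Wasserstein distributionally robust version of the performance estimation problem (PEP). The resulting problem is solved with stochastic gradient descent that differentiates through an inner semidefinite program, and the learned algorithm carries high-probability guarantees.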
Findings
Learned algorithms outperform baselines on benchmarks.
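With high probability, the true risk of the learned algorithm is at most the in-sample L2O optimum plus a slack that shrinks with the sample size, and it is no worse than the worst-case PEP bound.
The framework achieves robustness and strong out-of-sample performance.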
Abstract
We propose a distributionally robust approach to learning hyperparameters for first-order methods in convex optimization. Given a dataset of problem instances, we minimize a Wasserstein distributionally robust version of the performance estimation problem (PEP) over algorithm parameters such as step sizes. Our framework unifies two extremes: as the robustness radius vanishes, we recover classical learning to optimize (L2O); as it grows, we recover worst-case optimal algorithm design via PEP. We solve the resulting problem with stochastic gradient descent, differentiating through the solution of an inner semidefinite program at each step. We prove high-probability bounds showing that the true risk of the learned algorithm is at most the in-sample L2O optimum plus a slack that shrinks with the sample size, and is no worse than the worst-case PEP bound. On unconstrained quadratic…
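The two extremes described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's method: it replaces the inner semidefinite program with a simple upper-bound surrogate, assuming a type-1 Wasserstein ball, one gradient step on scalar quadratics f(x) = lam * x^2 / 2, and curvatures known to lie in [MU, L]. Because the one-step contraction factor |1 - t*lam| is t-Lipschitz in lam, the worst-case expectation over the ball is at most the empirical mean plus eps * t, and it never exceeds the worst case over [MU, L]. All names here (MU, L, robust_objective, learn_step_size) are hypothetical.

```python
import numpy as np

MU, L = 1.0, 10.0  # assumed curvature range: strong convexity and smoothness


def contraction(t, lam):
    """One gradient step on f(x) = lam * x**2 / 2 maps x to (1 - t*lam) * x."""
    return np.abs(1.0 - t * lam)


def robust_objective(t, lams, eps):
    """Upper bound on the worst-case mean contraction over a Wasserstein-1 ball.

    This is a toy surrogate (assumption), not the SDP-based objective of the paper.
    """
    empirical = contraction(t, lams).mean()                 # in-sample L2O objective
    worst = max(contraction(t, MU), contraction(t, L))      # worst case over [MU, L]
    return min(empirical + eps * t, worst)                  # Lipschitz bound, capped


def learn_step_size(lams, eps, grid=np.linspace(0.001, 1.0, 1000)):
    """Grid search over step sizes; the paper uses stochastic gradient descent instead."""
    values = [robust_objective(t, lams, eps) for t in grid]
    return float(grid[int(np.argmin(values))])


# Hypothetical dataset: curvatures clustered at the well-conditioned end.
lams = np.array([1.0, 1.5, 2.0, 2.5, 3.0])

t_l2o = learn_step_size(lams, eps=0.0)      # radius 0: purely data-driven
t_mid = learn_step_size(lams, eps=1.0)      # intermediate radius
t_pep = learn_step_size(lams, eps=L - MU)   # radius = support diameter: worst case

print(f"eps=0: t={t_l2o:.3f}  eps=1: t={t_mid:.3f}  eps={L - MU}: t={t_pep:.3f}")
```

In this toy setting the learned step size shrinks from roughly 0.4 (tuned to the sampled curvatures) toward 2 / (MU + L), about 0.18, which is the classical worst-case optimal step for a single gradient step, as the radius grows.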
