PAC-Bayesian Learning of Optimization Algorithms

Michael Sucker; Peter Ochs

arXiv:2210.11113·cs.LG·February 16, 2023

TL;DR

This paper introduces a PAC-Bayesian framework for learning optimization algorithms with provable generalization guarantees and an explicit trade-off between convergence probability and convergence speed, supported by a proof-of-concept on hyperparameter learning.

Contribution

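To the best of the authors' knowledge, it presents the first framework for learning optimization algorithms with provable generalization guarantees (PAC-bounds) and an explicit trade-off between a high probability of convergence and a high convergence speed.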

Findings

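01 Even in the limit case where convergence is guaranteed, the learned algorithms provably outperform related algorithms based on a deterministic worst-case analysis.

02 The framework provides PAC generalization bounds that hold for general, unbounded loss functions.

03 A proof-of-concept that learns hyperparameters empirically supports the theory.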

Abstract

We apply the PAC-Bayes theory to the setting of learning-to-optimize. To the best of our knowledge, we present the first framework to learn optimization algorithms with provable generalization guarantees (PAC-bounds) and explicit trade-off between a high probability of convergence and a high convergence speed. Even in the limit case, where convergence is guaranteed, our learned optimization algorithms provably outperform related algorithms based on a (deterministic) worst-case analysis. Our results rely on PAC-Bayes bounds for general, unbounded loss-functions based on exponential families. By generalizing existing ideas, we reformulate the learning procedure into a one-dimensional minimization problem and study the possibility to find a global minimum, which enables the algorithmic realization of the learning procedure. As a proof-of-concept, we learn hyperparameters of standard…
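One way to see how a one-dimensional problem can arise is the classical Gibbs-posterior argument, sketched below. This is not the paper's bound or its experimental setup: the quadratic test problems, the grid of step sizes, the uniform prior, and the helper names (`gd_loss`, `gibbs_posterior`, `objective`) are all assumptions made for illustration. For a fixed scalar `lam > 0`, the posterior that minimizes "expected loss + KL(posterior || prior) / lam" is the exponential-family (Gibbs) distribution proportional to `prior * exp(-lam * loss)`, so once that form is fixed only the scalar `lam` remains to be chosen.

```python
import numpy as np

def gd_loss(step, curvatures, n_iter=10):
    """Mean final loss of gradient descent on f(x) = 0.5 * c * x**2, started at x = 1."""
    x = np.ones_like(curvatures)
    for _ in range(n_iter):
        x = x - step * curvatures * x  # gradient of f is c * x
    return float(np.mean(0.5 * curvatures * x**2))

def gibbs_posterior(prior, losses, lam):
    """Posterior proportional to prior * exp(-lam * losses), computed in log space."""
    log_w = np.log(prior) - lam * losses
    log_w -= log_w.max()  # shift for numerical stability
    w = np.exp(log_w)
    return w / w.sum()

def objective(rho, prior, losses, lam):
    """Expected loss under rho plus KL(rho || prior) / lam."""
    mask = rho > 0  # terms with rho = 0 contribute nothing to the KL sum
    kl = np.sum(rho[mask] * np.log(rho[mask] / prior[mask]))
    return float(rho @ losses + kl / lam)

# Hypothetical data: 50 random quadratics and a grid of candidate step sizes.
rng = np.random.default_rng(0)
curvatures = rng.uniform(0.5, 2.0, size=50)
steps = np.linspace(0.05, 1.2, 24)
prior = np.full(steps.size, 1.0 / steps.size)  # uniform prior over the grid
losses = np.array([gd_loss(s, curvatures) for s in steps])

lam = 20.0
rho = gibbs_posterior(prior, losses, lam)

# Closed-form minimum value: -(1 / lam) * log E_prior[exp(-lam * loss)].
closed_form = float(-np.log(prior @ np.exp(-lam * losses)) / lam)

print("objective at Gibbs posterior:", objective(rho, prior, losses, lam))
print("closed-form minimum         :", closed_form)
print("objective at prior          :", objective(prior, prior, losses, lam))
```

The first two printed values should agree up to floating-point error, and both should be no larger than the third. A full PAC-Bayes bound would add further terms that depend on `lam`, the sample size, and the confidence level, and selecting `lam` against those terms is the scalar search that this sketch leaves out.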

Taxonomy

Topics: Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research