Hedge algorithm and Dual Averaging schemes
Michel Baes, Michael Bürgisser

TL;DR
This paper reinterprets the Hedge algorithm as an instance of Nesterov's Dual Averaging schemes, derives three variants whose convergence guarantees are at least as good as those of the original method, and reports that they significantly outperform it in numerical experiments.
Contribution
The paper interprets the Hedge algorithm as a Dual Averaging scheme and proposes three variants: one that keeps the original form but uses optimal parameters, one that requires less a priori information, and one that is better adapted to the Hedge setting.
Findings
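All three modified methods have convergence guarantees that are better than, or at least as good as, those of the original Hedge algorithm.
In numerical experiments, the modified methods significantly outperform the original scheme.
One of the variants requires less a priori information than the original method.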
Abstract
We show that the Hedge algorithm, a method that is widely used in Machine Learning, can be interpreted as a particular instance of Dual Averaging schemes, which have recently been introduced by Nesterov for regret minimization. Based on this interpretation, we establish three alternative methods of the Hedge algorithm: one in the form of the original method, but with optimal parameters, one that requires less a priori information, and one that is better adapted to the context of the Hedge algorithm. All our modified methods have convergence results that are better or at least as good as the performance guarantees of the vanilla method. In numerical experiments, our methods significantly outperform the original scheme.
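For readers unfamiliar with the baseline, the following is a minimal sketch of the textbook Hedge update, not of the paper's three variants. It assumes a fixed step size `eta` and per-round loss vectors with entries in [0, 1]; the function names `hedge_weights` and `run_hedge` are hypothetical and chosen for illustration. The weights depend only on the cumulative loss vector, and they coincide with the minimizer over the probability simplex of the linear term plus `1/eta` times the negative entropy, which is the kind of accumulate-then-project structure that a Dual Averaging reading relies on.

```python
import math


def hedge_weights(cumulative_losses, eta):
    """Return Hedge weights w_i proportional to exp(-eta * L_i)."""
    # Shift by the smallest cumulative loss so the exponentials cannot
    # overflow; the shift cancels after normalization.
    smallest = min(cumulative_losses)
    unnormalized = [math.exp(-eta * (loss - smallest)) for loss in cumulative_losses]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def run_hedge(loss_rounds, eta):
    """Play Hedge over a sequence of loss vectors.

    Returns the learner's total expected loss and the final cumulative
    loss of each expert.
    """
    num_experts = len(loss_rounds[0])
    cumulative = [0.0] * num_experts
    expected_loss = 0.0
    for losses in loss_rounds:
        # Weights are chosen before the current round's losses are revealed.
        weights = hedge_weights(cumulative, eta)
        expected_loss += sum(w * l for w, l in zip(weights, losses))
        # Accumulate the revealed losses for the next round.
        cumulative = [c + l for c, l in zip(cumulative, losses)]
    return expected_loss, cumulative


if __name__ == "__main__":
    # Hypothetical toy data: two experts, four rounds.
    rounds = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
    eta = math.sqrt(8 * math.log(2) / len(rounds))
    total, cumulative = run_hedge(rounds, eta)
    print("learner loss:", total)
    print("regret vs best expert:", total - min(cumulative))
```

The step size in the usage example is the classical fixed choice when the number of rounds is known in advance. The paper's variants concern exactly such parameter choices and how much must be known beforehand, which this sketch does not attempt to reproduce.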
