Better Parameter-free Stochastic Optimization with ODE Updates for Coin-Betting
Keyi Chen, John Langford, Francesco Orabona

TL;DR
This paper introduces a novel parameter-free stochastic optimization algorithm based on ODE updates for coin-betting, which empirically outperforms default SGD and nearly matches tuned baselines without requiring parameter tuning.
Contribution
The paper develops a new parameter-free algorithm whose coin-betting update is the closed-form solution of an ODE, narrowing the empirical gap between parameter-free methods and tuned SGD in stochastic optimization.
Findings
Outperforms default SGD without tuning
Nearly matches performance of tuned baselines
Uses ODE-based updates for parameter-free optimization
Abstract
Parameter-free stochastic gradient descent (PFSGD) algorithms do not require setting learning rates while achieving optimal theoretical performance. In practical applications, however, there remains an empirical gap between tuned stochastic gradient descent (SGD) and PFSGD. In this paper, we close the empirical gap with a new parameter-free algorithm based on continuous-time Coin-Betting on truncated models. The new update is derived through the solution of an Ordinary Differential Equation (ODE) and solved in a closed form. We show empirically that this new parameter-free algorithm outperforms algorithms with the "best default" learning rates and almost matches the performance of finely tuned baselines without anything to tune.
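For readers unfamiliar with coin-betting, the following is a minimal sketch of the classic Krichevsky-Trofimov (KT) coin-betting optimizer in one dimension. It is not the ODE-based update proposed in this paper; it only illustrates how a parameter-free method can choose its iterates from accumulated "wealth" rather than from a learning rate. The function name `kt_coin_betting_minimize`, the `initial_wealth` argument, and the test objective |x - 3| are assumptions made for illustration, and subgradients are clipped to [-1, 1] as the KT analysis requires.

```python
def kt_coin_betting_minimize(grad, steps=5000, initial_wealth=1.0):
    """Illustrative 1-D KT coin-betting optimizer (hypothetical helper).

    grad: callable returning a (sub)gradient of a convex function at x.
    Returns the average of the iterates; no learning rate is involved.
    """
    wealth = initial_wealth  # money the bettor currently holds
    coin_sum = 0.0           # running sum of past "coin outcomes" c_i = -g_i
    x_sum = 0.0              # running sum of iterates, for averaging

    for t in range(1, steps + 1):
        # KT betting fraction: average of past outcomes, always in (-1, 1).
        beta = coin_sum / t
        # The iterate is the signed amount of wealth that is bet.
        x = beta * wealth
        # Query the subgradient and clip it to [-1, 1] (assumed bound).
        g = max(-1.0, min(1.0, grad(x)))
        c = -g
        # Wealth grows when the bet agrees with the negative gradient.
        wealth += c * x
        coin_sum += c
        x_sum += x

    return x_sum / steps


if __name__ == "__main__":
    # Subgradient of f(x) = |x - 3|, whose minimizer is x = 3.
    def sign_grad(x):
        return (x > 3.0) - (x < 3.0)

    print(kt_coin_betting_minimize(sign_grad))
```

The only inputs are the subgradient oracle and the initial wealth; the effective step size emerges from the wealth, which is the property the paper's algorithm also relies on.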
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Advanced Bandit Algorithms Research
