Variants of RMSProp and Adagrad with Logarithmic Regret Bounds
Mahesh Chandra Mukkamala, Matthias Hein

TL;DR
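This paper analyzes RMSProp in the online convex optimization setting and introduces two variants, SC-Adagrad and SC-RMSProp, that achieve logarithmic regret bounds for strongly convex functions. In experiments on strongly convex problems and deep neural networks, the variants outperform other adaptive gradient methods and stochastic gradient descent.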
Contribution
The paper provides the first logarithmic regret bounds for variants of RMSProp and Adagrad in strongly convex settings, along with empirical validation.
Findings
SC-Adagrad and SC-RMSProp achieve logarithmic regret bounds for strongly convex functions.
In the reported experiments, the variants outperform other adaptive gradient methods and stochastic gradient descent.
They improve optimization in both convex functions and deep neural networks.
Abstract
Adaptive gradient methods have recently become very popular, in particular because they have been shown to be useful in the training of deep neural networks. In this paper we analyze RMSProp, originally proposed for the training of deep neural networks, in the context of online convex optimization and show $\sqrt{T}$-type regret bounds. Moreover, we propose two variants, SC-Adagrad and SC-RMSProp, for which we show logarithmic regret bounds for strongly convex functions. Finally, we demonstrate in experiments that these new variants outperform other adaptive gradient techniques and stochastic gradient descent in the optimization of strongly convex functions as well as in the training of deep neural networks.
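Example (illustrative sketch)
The abstract does not spell out the update rules, so the following is only a minimal sketch of how an SC-Adagrad-style step might look, under stated assumptions. It assumes the variant accumulates squared gradients like Adagrad but divides by the accumulator directly, without the square root. The constant damping term `delta`, the step size `alpha`, the toy quadratic objective, and all function names are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def sc_adagrad_step(theta, grad, v, alpha=1.0, delta=1.0):
    """One SC-Adagrad-style step (assumed form: no square root on v).

    `delta` is a constant damping term, a simplification assumed here.
    """
    v = v + grad * grad                      # accumulate squared gradients
    theta = theta - alpha * grad / (v + delta)
    return theta, v


def adagrad_step(theta, grad, v, alpha=1.0, delta=1e-8):
    """Standard Adagrad step, shown for comparison (square root on v)."""
    v = v + grad * grad
    theta = theta - alpha * grad / (np.sqrt(v) + delta)
    return theta, v


def minimize_quadratic(step_fn, steps=200):
    """Run a step function on f(theta) = 0.5 * ||theta - target||^2.

    This toy objective is strongly convex; it is an assumption made
    purely for the demonstration.
    """
    target = np.array([1.0, -2.0, 0.5])
    theta = np.zeros_like(target)
    v = np.zeros_like(target)
    losses = []
    for _ in range(steps):
        losses.append(0.5 * float(np.sum((theta - target) ** 2)))
        grad = theta - target                # gradient of the quadratic
        theta, v = step_fn(theta, grad, v)
    losses.append(0.5 * float(np.sum((theta - target) ** 2)))
    return losses


losses = minimize_quadratic(sc_adagrad_step)
print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.3e}")
```

With `alpha = 1` and `delta = 1`, the per-coordinate step factor `alpha / (v + delta)` never exceeds 1 on this problem, so the loss cannot increase from one step to the next. On other problems the step size would need to be tuned to the gradient scale and the strong convexity constant.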
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques
Methods: RMSProp
