Enhancing Fractional Gradient Descent with Learned Optimizers
Jan Sobotka, Petr Šimánek, Pavel Kordík

TL;DR
This paper introduces L2O-CFGD, a meta-learning approach that dynamically tunes hyperparameters in fractional gradient descent, improving convergence and performance in complex optimization tasks like neural network training.
Contribution
The paper presents a meta-learning method that adaptively tunes the hyperparameters of Caputo fractional gradient descent, targeting its convergence and hyperparameter-scheduling difficulties.
Findings
Meta-learned schedule outperforms CFGD with static hyperparameters found through an extensive search.
In some tasks, achieves performance comparable to black-box meta-learners.
Provides insights into leveraging fractional calculus in optimization.
Abstract
Fractional Gradient Descent (FGD) offers a novel and promising way to accelerate optimization by incorporating fractional calculus into machine learning. Although FGD has shown encouraging initial results across various optimization tasks, it faces significant challenges with convergence behavior and hyperparameter selection. Moreover, the impact of its hyperparameters is not fully understood, and scheduling them is particularly difficult in non-convex settings such as neural network training. To address these issues, we propose a novel approach called Learning to Optimize Caputo Fractional Gradient Descent (L2O-CFGD), which meta-learns how to dynamically tune the hyperparameters of Caputo FGD (CFGD). Our method's meta-learned schedule outperforms CFGD with static hyperparameters found through an extensive search and, in some tasks, achieves performance comparable to a fully black-box…
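To make the role of hyperparameter scheduling concrete, the following is a minimal one-dimensional sketch, not the paper's CFGD update rule. It uses the closed-form Caputo derivative of order alpha of the toy objective f(t) = 0.5 * (t - a)^2, with the lower terminal placed at c = x - beta. The terminal placement, the names `alpha`, `beta`, `caputo_grad_quadratic`, and `run_fgd`, and the hand-written annealing schedule (a stand-in for a meta-learned one) are all assumptions made for illustration.

```python
import math


def caputo_grad_quadratic(x, a, alpha, beta):
    """Caputo derivative of order alpha in (0, 1] of f(t) = 0.5 * (t - a)**2,
    evaluated at x with lower terminal c = x - beta (beta > 0).

    Uses D^alpha (t - c)^p = Gamma(p + 1) / Gamma(p + 1 - alpha) * (t - c)^(p - alpha)
    applied to f expanded around c. The terminal choice is an assumption.
    """
    c = x - beta
    linear_term = (c - a) * beta ** (1.0 - alpha) / math.gamma(2.0 - alpha)
    quadratic_term = beta ** (2.0 - alpha) / math.gamma(3.0 - alpha)
    return linear_term + quadratic_term


def run_fgd(x0, a, schedule, lr=0.5, steps=200):
    """Iterate x <- x - lr * D^alpha f(x), reading (alpha, beta) from schedule(k)."""
    x = x0
    for k in range(steps):
        alpha, beta = schedule(k)
        x -= lr * caputo_grad_quadratic(x, a, alpha, beta)
    return x


def static_schedule(k):
    # Fixed fractional order and terminal offset for every step.
    return 0.5, 1.0


def annealed_schedule(k):
    # Hypothetical hand-written schedule: alpha rises from 0.5 toward 1.
    # A learned optimizer would produce these values instead.
    return 1.0 - 0.5 * 0.9 ** k, 1.0


x_static = run_fgd(0.0, 3.0, static_schedule)
x_annealed = run_fgd(0.0, 3.0, annealed_schedule)
print(f"static:   {x_static:.6f}")
print(f"annealed: {x_annealed:.6f}")
```

In this toy setting the static run settles at a point offset from the minimizer a = 3 by beta * (1 - alpha) / (2 - alpha), while the annealed run approaches the minimizer as alpha approaches 1, where the Caputo derivative reduces to the ordinary gradient x - a. This only illustrates why the choice and scheduling of such hyperparameters can matter; it says nothing about how L2O-CFGD parameterizes or learns its schedule.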
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Metaheuristic Optimization Algorithms Research · Machine Learning and Data Classification
