EAGLE: Early Approximated Gradient Based Learning Rate Estimator

Takumi Fujimoto; Hiroaki Nishi

arXiv:2502.01036·cs.LG·February 4, 2025

TL;DR

EAGLE is a novel optimization method that accelerates early training convergence by estimating optimal parameters using parameter and gradient changes, with an adaptive mechanism to switch between EAGLE and Adam for stability.

Contribution

The paper introduces EAGLE, an optimizer that pairs a new update rule, which speeds up early training by using parameter and gradient changes between steps, with a dynamic switching mechanism that selects between the EAGLE and Adam updates for stability.

Findings

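01. The EAGLE update rule reaches low training loss in fewer epochs during early training.

02. The adaptive switching mechanism between Adam and EAGLE enhances training stability.

03. On standard benchmark datasets, EAGLE converges in training loss faster than conventional optimization methods.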

Abstract

We propose the EAGLE update rule, a novel optimization method that accelerates loss convergence during the early stages of training by leveraging both current and previous step parameter and gradient values. The update algorithm estimates optimal parameters by computing the changes in parameters and gradients between consecutive training steps and using the local curvature of the loss landscape derived from these changes. Because this update rule can be unstable, we introduce an adaptive switching mechanism that dynamically selects between the Adam and EAGLE update rules to enhance training stability. Experiments on standard benchmark datasets demonstrate that the EAGLE optimizer, which combines this novel update rule with the switching mechanism, achieves rapid training loss convergence in fewer epochs than conventional optimization methods.
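
The abstract does not give the update formula, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes a per-coordinate secant estimate: the ratio of gradient change to parameter change approximates local curvature, and the estimated optimum is the point where the locally linear gradient reaches zero. The switching test used here (fall back to Adam wherever the curvature estimate is not positive) and the names `secant_estimate`, `adam_step`, and `hybrid_step` are hypothetical choices made for illustration; the paper's actual switching criterion may differ.

```python
import numpy as np

def secant_estimate(theta, grad, prev_theta, prev_grad, eps=1e-12):
    """Per-coordinate secant estimate of the minimizer (illustrative assumption)."""
    d_theta = theta - prev_theta
    d_grad = grad - prev_grad
    # Curvature is approximated by d_grad / d_theta; it is treated as usable
    # only where the parameter actually moved and the estimate is positive.
    usable = (np.abs(d_theta) > eps) & (d_grad * d_theta > 0)
    safe_d_grad = np.where(usable, d_grad, 1.0)  # avoid division by zero
    estimate = theta - grad * d_theta / safe_d_grad
    return estimate, usable

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """Standard Adam update with bias correction; t is the 1-based step count."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def hybrid_step(theta, grad, prev_theta, prev_grad, m, v, t):
    """Use the secant estimate where usable, otherwise the Adam update.

    The switching rule is a hypothetical stand-in for the paper's mechanism.
    """
    adam_theta, m, v = adam_step(theta, grad, m, v, t)
    if prev_theta is None:  # first step: no previous values to difference
        return adam_theta, m, v
    estimate, usable = secant_estimate(theta, grad, prev_theta, prev_grad)
    return np.where(usable, estimate, adam_theta), m, v

# Toy example: f(theta) = 0.5 * a * theta**2 per coordinate, so grad = a * theta.
# The first coordinate is convex (a = 3), the second is concave (a = -3).
a = np.array([3.0, -3.0])
prev_theta = np.array([2.0, 2.0])
theta = np.array([1.0, 1.0])
new_theta, m, v = hybrid_step(
    theta, a * theta, prev_theta, a * prev_theta,
    m=np.zeros(2), v=np.zeros(2), t=1,
)
```

On the convex coordinate the secant estimate lands on the minimizer of the quadratic in a single step, which is the kind of early acceleration the abstract describes. On the concave coordinate the curvature estimate is negative, so this sketch takes the small Adam step instead, illustrating why some fallback is needed for stability.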

Taxonomy

Topics: Model Reduction and Neural Networks · Medical Imaging Techniques and Applications