Dynamic Learning Rate Scheduling based on Loss Changes Leads to Faster Convergence

Shreyas Subramanian; Bala Krishnamoorthy; Pranav Murthy

arXiv:2512.14527 · cs.AI · December 17, 2025

Open Access

TL;DR

This paper introduces GreedyLR, an adaptive learning rate scheduler that adjusts the learning rate based on changes in the training loss. The authors report faster convergence and improved accuracy across NLP, CV, and LLM tasks.

Contribution

The paper presents GreedyLR, a new adaptive scheduler backed by a convergence proof and by robustness experiments on noisy loss landscapes. The authors report that it outperforms common schedulers on fine-tuning and pre-training tasks with models of up to 7B parameters.

Findings

01. GreedyLR improves convergence speed and accuracy.

02. It outperforms cosine and exponential decay schedulers.

03. Theoretical analysis confirms convergence and robustness.

Abstract

Despite significant advances in optimizers for training, most research still relies on common scheduler choices such as cosine or exponential decay. In this paper, we study GreedyLR, a novel scheduler that adaptively adjusts the learning rate during training based on the current loss. To validate the effectiveness of our proposed scheduler, we conduct experiments on several NLP, CV, and LLM tasks with up to 7B parameters, including both fine-tuning and pre-training experiments. The results show that our approach outperforms several state-of-the-art schedulers in terms of accuracy, speed, and convergence. We also provide a theoretical analysis of the GreedyLR algorithm, including a proof of convergence and a derivation of the optimal scaling factor $F$ that maximizes the convergence rate, along with experiments that show the robustness of the algorithm to realistic noisy landscapes. Our…
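The abstract describes a scheduler that changes the learning rate in response to the loss and is governed by a scaling factor $F$, but it does not spell out the update rule. The Python sketch below illustrates one plausible loss-driven rule under stated assumptions: the learning rate is divided by `factor` when the loss drops relative to the previous step and multiplied by `factor` when it does not. The class name `LossBasedLR`, the bounds `min_lr` and `max_lr`, and the comparison against the previous loss are all hypothetical choices made for illustration; this is not the authors' reference implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LossBasedLR:
    """Hypothetical loss-driven scheduler (illustration only, not the paper's code)."""

    lr: float
    factor: float = 0.5       # assumed scaling factor F, with 0 < F < 1
    min_lr: float = 1e-6      # assumed lower bound on the learning rate
    max_lr: float = 1.0       # assumed upper bound on the learning rate
    prev_loss: Optional[float] = None

    def step(self, loss: float) -> float:
        """Update and return the learning rate given the latest loss value."""
        if self.prev_loss is not None:
            if loss < self.prev_loss:
                # Loss improved: take larger steps, capped at max_lr.
                self.lr = min(self.lr / self.factor, self.max_lr)
            else:
                # Loss stalled or worsened: take smaller steps, floored at min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
        self.prev_loss = loss
        return self.lr


if __name__ == "__main__":
    scheduler = LossBasedLR(lr=0.1, factor=0.5)
    for loss in [1.0, 0.8, 0.9, 0.7]:
        print(f"loss={loss:.2f} -> lr={scheduler.step(loss):.4f}")
```

In a training loop, `step` would be called once per iteration or evaluation with the latest loss, and the returned value copied into the optimizer. Real training losses are noisy, so a practical variant would likely smooth the loss or require several consecutive changes before reacting; those refinements are left out of this sketch.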

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Domain Adaptation and Few-Shot Learning