Unifying Sign and Magnitude for Optimizing Deep Vision Networks via ThermoLion
Ahmed Nebli

TL;DR
ThermoLion is an optimizer for deep vision models that uses local signal-to-noise ratio (SNR) gating to switch adaptively between low-bit (sign-based) exploration and high-precision (magnitude-based) exploitation; the authors report improved convergence and accuracy.
Contribution
The paper introduces ThermoLion, a dynamic optimizer that combines sign-based and magnitude-based updates through SNR-based gating and momentum alignment to improve the training of vision models.
Findings
Outperforms AdamW and Lion in convergence speed
Achieves higher accuracy across 12 vision datasets
Demonstrates effective dynamic modulation of gradient updates
Abstract
The training of deep vision models is fundamentally a signal recovery problem amidst high-dimensional stochastic noise. Current optimization paradigms impose a static compromise on information channel capacity. For instance, magnitude-based methods, such as AdamW, operate on the assumption that gradient norms are high-fidelity curvature signals. While this allows for precision in smooth regimes, it leads to catastrophic noise amplification when applied to rugged, non-convex landscapes. Conversely, sign-based methods (e.g., Lion) perform a radical 1-bit quantization of the gradient, which aims to provide robust regularization at the cost of discarding fine-grained descent information. We propose that optimal convergence requires neither static prior, but rather a dynamic modulation of the update bitrate. We introduce ThermoLion, a vision-centric framework that utilizes local…
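The contrast the abstract draws between magnitude-based and sign-based updates can be sketched in a few lines of NumPy. This is an illustration only, not the authors' algorithm: the abstract is truncated before the gating rule is stated, so the functions `magnitude_direction`, `sign_direction`, `snr_gate`, and `blended_step`, the SNR estimate `m**2 / v`, and all hyperparameter values are assumptions made for this sketch. Bias correction, weight decay, and the momentum alignment mentioned in the summary are omitted.

```python
import numpy as np

def magnitude_direction(m, v, eps=1e-8):
    """Adam-style direction: first moment scaled by the root of the second moment."""
    return m / (np.sqrt(v) + eps)

def sign_direction(m):
    """Lion-style direction: keep only the sign of each coordinate (1 bit)."""
    return np.sign(m)

def snr_gate(m, v, eps=1e-8):
    """Hypothetical per-coordinate gate in [0, 1).

    Assumption: m**2 / v serves as a local signal-to-noise estimate.
    High SNR pushes the gate toward 1 (trust magnitudes); low SNR
    pushes it toward 0 (fall back to signs).
    """
    snr = m**2 / (v + eps)
    return snr / (1.0 + snr)

def blended_step(param, grad, m, v, lr=1e-3, beta1=0.9, beta2=0.99):
    """One illustrative update that mixes the two directions via the gate."""
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    g = snr_gate(m, v)
    direction = g * magnitude_direction(m, v) + (1 - g) * sign_direction(m)
    return param - lr * direction, m, v

# Usage: a single step on a two-parameter toy problem.
param = np.array([1.0, -1.0])
grad = np.array([0.5, -0.5])
m = np.zeros_like(param)
v = np.zeros_like(param)
param, m, v = blended_step(param, grad, m, v)
```

A soft blend is used here purely for simplicity; a hard switch between the two directions would fit the abstract's description equally well, and the paper itself should be consulted for the actual rule.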
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Age of Information Optimization
