Unifying Sign and Magnitude for Optimizing Deep Vision Networks via ThermoLion

Ahmed Nebli

arXiv:2512.01881 · cs.LG · December 3, 2025

TL;DR

ThermoLion is an optimization framework for deep vision models that uses local signal-to-noise ratio (SNR) gating to switch adaptively between low-bit exploration and high-precision exploitation, which the paper reports improves convergence and accuracy.

Contribution

The paper introduces ThermoLion, a dynamic optimizer that combines sign-based and magnitude-based updates through SNR-based gating and momentum alignment to improve the training of vision models.

Findings

01. Outperforms AdamW and Lion in convergence speed

02. Achieves higher accuracy across 12 vision datasets

03. Demonstrates effective dynamic modulation of gradient updates

Abstract

The training of deep vision models is fundamentally a signal recovery problem amidst high-dimensional stochastic noise. Current optimization paradigms impose a static compromise on information channel capacity. For instance, magnitude-based methods, such as AdamW, operate on the assumption that gradient norms are high-fidelity curvature signals. While this allows for precision in smooth regimes, it leads to catastrophic noise amplification when applied to rugged, non-convex landscapes. Conversely, sign-based methods (e.g., Lion) perform a radical 1-bit quantization of the gradient, which aims to provide robust regularization at the cost of discarding fine-grained descent information. We propose that optimal convergence requires neither static prior, but rather a dynamic modulation of the update bitrate. We introduce ThermoLion, a vision-centric framework that utilizes local…
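The abstract above is cut off before it states ThermoLion's actual update rule, so the following is only a minimal, hypothetical sketch of the general idea it describes: blending a 1-bit sign update with a magnitude-aware update according to a per-coordinate signal-to-noise estimate. The function name `snr_gated_step`, the gate formula `snr / (1 + snr)`, the shared decay `beta`, and the variance estimate are all assumptions made for illustration and are not taken from the paper. The sign branch is a simplified Lion-style step (sign of the momentum), the magnitude branch is an Adam-style normalized step, and weight decay and momentum alignment are omitted.

```python
import math


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def snr_gated_step(param, grad, m, v, lr=0.01, beta=0.9, eps=1e-8):
    """One hypothetical SNR-gated update over flat lists of floats.

    This is an illustrative rule, NOT the ThermoLion algorithm.
    Returns the new parameters and the updated moment estimates.
    """
    new_param, new_m, new_v = [], [], []
    for p, g, mi, vi in zip(param, grad, m, v):
        # Exponential moving averages of the gradient and its square.
        mi = beta * mi + (1.0 - beta) * g
        vi = beta * vi + (1.0 - beta) * g * g

        # Crude per-coordinate noise estimate (assumption): E[g^2] - E[g]^2.
        var = max(vi - mi * mi, 0.0)
        snr = (mi * mi) / (var + eps)

        # Gate in [0, 1): high SNR trusts the magnitude, low SNR trusts the sign.
        gate = snr / (1.0 + snr)

        magnitude_update = mi / (math.sqrt(vi) + eps)  # Adam-style normalized step
        sign_update = sign(mi)                         # simplified Lion-style 1-bit step

        update = gate * magnitude_update + (1.0 - gate) * sign_update
        new_param.append(p - lr * update)
        new_m.append(mi)
        new_v.append(vi)
    return new_param, new_m, new_v


if __name__ == "__main__":
    # Toy usage: minimize f(x) = x^2, whose gradient is 2x.
    x, m, v = [1.0], [0.0], [0.0]
    for _ in range(50):
        x, m, v = snr_gated_step(x, [2.0 * x[0]], m, v)
    print(x[0])
```

Because both branches are bounded by 1 in absolute value, each coordinate moves by at most `lr` per step in this sketch, regardless of how the gate is set. A real implementation would operate on tensors and would follow whatever gating statistic the full paper defines.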

Taxonomy

Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Age of Information Optimization