Leveraging Second-Order Curvature for Efficient Learned Image Compression: Theory and Empirical Evidence
Yichi Zhang, Fengqing Zhu

TL;DR
This paper demonstrates that using a second-order quasi-Newton optimizer, SOAP, significantly improves training efficiency, stability, and robustness of learned image compression models by resolving gradient conflicts and reducing outliers.
Contribution
The work applies SOAP, an existing second-order quasi-Newton optimizer, to learned image compression (LIC) training and shows that it improves convergence speed, final rate-distortion performance, and robustness compared with first-order methods.
Findings
SOAP accelerates LIC training and improves rate-distortion performance.
Second-order training reduces activation and latent outliers.
Models trained with SOAP are more robust to post-training quantization.
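Why fewer outliers would help post-training quantization can be illustrated with a generic example. The sketch below is not the paper's quantization scheme; it assumes plain min-max uniform quantization at 8 bits, and the function name `quantize_minmax` and the synthetic activations are hypothetical. It shows that a single large outlier stretches the quantization range and raises the error on all the ordinary values.

```python
import numpy as np

def quantize_minmax(x, bits=8):
    """Uniform min-max quantization followed by dequantization (illustrative only)."""
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / (2 ** bits - 1)   # step size grows with the value range
    q = np.round((x - lo) / scale)        # integer code in [0, 2**bits - 1]
    return q * scale + lo                 # map codes back to real values

rng = np.random.default_rng(0)
acts = rng.normal(0.0, 1.0, 10_000)       # synthetic "well-behaved" activations
with_outlier = np.append(acts, 100.0)     # same values plus one extreme outlier

# Mean squared quantization error measured on the ordinary values only.
err_clean = np.mean((quantize_minmax(acts) - acts) ** 2)
err_outlier = np.mean((quantize_minmax(with_outlier)[:-1] - acts) ** 2)

print(f"MSE without outlier: {err_clean:.2e}")
print(f"MSE with outlier:    {err_outlier:.2e}")
```

Under these assumptions the outlier widens the range by more than an order of magnitude, so the step size and the error on every other value grow with it.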
Abstract
Training learned image compression (LIC) models entails navigating a challenging optimization landscape defined by the fundamental trade-off between rate and distortion. Standard first-order optimizers, such as SGD and Adam, struggle with gradient conflicts arising from competing objectives, leading to slow convergence and suboptimal rate-distortion performance. In this work, we demonstrate that a simple utilization of a second-order quasi-Newton optimizer, SOAP, dramatically improves both training efficiency and final performance across diverse LICs. Our theoretical and empirical analyses reveal that Newton preconditioning inherently resolves the intra-step and inter-step update conflicts intrinsic to the R-D objective, facilitating faster, more stable convergence. Beyond acceleration, we uncover a critical deployability benefit: second-order trained models exhibit…
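The notions of gradient conflict and Newton preconditioning mentioned in the abstract can be made concrete on a toy problem. The sketch below is not the paper's analysis: it assumes two hypothetical quadratic terms standing in for rate and distortion, a loss of the form R + lam * D, and an exact Newton step. SOAP itself approximates curvature rather than inverting an exact Hessian, so the exact step here is only a stand-in.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity; a negative value means the two gradients conflict."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical quadratic stand-ins: R(x) = 0.5 (x-a)^T A (x-a), D(x) = 0.5 (x-b)^T B (x-b)
A = np.diag([100.0, 1.0])
B = np.diag([100.0, 1.0])
a = np.array([1.0, 0.0])    # minimizer of the "rate" term
b = np.array([0.0, 1.0])    # minimizer of the "distortion" term
lam = 1.0                   # trade-off weight in L = R + lam * D

def total_grad(x):
    return A @ (x - a) + lam * (B @ (x - b))

H = A + lam * B                                      # Hessian of the combined loss
x_star = np.linalg.solve(H, A @ a + lam * (B @ b))   # exact minimizer of L

x0 = np.array([0.4, 0.9])
conflict = cosine(A @ (x0 - a), B @ (x0 - b))        # per-term gradients oppose each other

# One Newton-preconditioned step: x - H^{-1} grad
x_newton = x0 - np.linalg.solve(H, total_grad(x0))

# 100 plain gradient-descent steps with step size 1 / (largest eigenvalue of H)
lr = 1.0 / np.linalg.eigvalsh(H).max()
x_gd = x0.copy()
for _ in range(100):
    x_gd = x_gd - lr * total_grad(x_gd)

print("cosine between term gradients:", conflict)
print("Newton error after 1 step:", np.linalg.norm(x_newton - x_star))
print("GD error after 100 steps: ", np.linalg.norm(x_gd - x_star))
```

On this toy problem the two term gradients point in nearly opposite directions, the single preconditioned step lands on the minimizer of the combined loss, and plain gradient descent is still far from it along the low-curvature direction after 100 steps.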
Taxonomy
Topics: Advanced Data Compression Techniques · Stochastic Gradient Optimization Techniques · Advanced Image Processing Techniques
