HERO: Hessian-Enhanced Robust Optimization for Unifying and Improving Generalization and Quantization Performance
Huanrui Yang, Xiaoxuan Yang, Neil Zhenqiang Gong, Yiran Chen

TL;DR
HERO is an optimization method that trains neural networks toward smaller Hessian eigenvalues. This improves both generalization (up to 3.8% higher test accuracy) and post-training quantization accuracy (over 10% higher), and makes the model more robust under perturbations.
Contribution
The paper introduces HERO, a Hessian-enhanced robust optimization framework that unifies and improves model generalization and quantization performance through Hessian eigenvalue minimization.
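The link between weight perturbation and Hessian eigenvalues can be illustrated with a standard second-order Taylor argument. This is a sketch, not necessarily the paper's exact derivation; the symbols L (training loss), w (weights), \delta (weight perturbation), and \epsilon (perturbation bound) are notation assumed here for illustration.

```latex
\begin{aligned}
L(w+\delta) &\approx L(w) + g^\top \delta + \tfrac{1}{2}\,\delta^\top H \delta,
  \qquad g = \nabla_w L(w),\quad H = \nabla_w^2 L(w), \\
\max_{\|\delta\|_2 \le \epsilon} \bigl[ L(w+\delta) - L(w) \bigr]
  &\lesssim \epsilon\,\|g\|_2 + \tfrac{1}{2}\,\epsilon^2 \max\bigl(\lambda_{\max}(H),\, 0\bigr).
\end{aligned}
```

Near a well-trained minimum the gradient term is small, so the worst-case loss increase is governed mainly by the largest Hessian eigenvalue. Round-to-nearest uniform quantization (assuming no clipping) moves each weight by at most half a quantization step, so it can be viewed as one such bounded perturbation.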
Findings
Up to 3.8% test accuracy improvement.
30% higher accuracy under label perturbation.
Over 10% accuracy gain in post-training quantization.
Abstract
With the recent demand for deploying neural network models on mobile and edge devices, it is desirable to improve the model's generalizability on unseen testing data, as well as to enhance the model's robustness under fixed-point quantization for efficient deployment. Minimizing the training loss, however, provides few guarantees on generalization and quantization performance. In this work, we address the need to improve generalization and quantization performance simultaneously by theoretically unifying them under the framework of improving the model's robustness against bounded weight perturbation and minimizing the eigenvalues of the Hessian matrix with respect to model weights. We therefore propose HERO, a Hessian-enhanced robust optimization method, to minimize the Hessian eigenvalues through a gradient-based training process, simultaneously improving the generalization and…
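To make "the eigenvalues of the Hessian matrix with respect to model weights" concrete, here is a minimal sketch of how the largest Hessian eigenvalue can be estimated using only gradients. It is not HERO's training procedure. It uses power iteration with finite-difference Hessian-vector products on a hypothetical toy quadratic loss, and the function names (grad, hvp, top_hessian_eigenvalue) are illustrative choices, not taken from the paper.

```python
import math

def grad(w):
    """Gradient of the toy loss L(w) = 0.5 * w^T A w with A = [[3, 1], [1, 2]]."""
    return [3.0 * w[0] + 1.0 * w[1], 1.0 * w[0] + 2.0 * w[1]]

def hvp(grad_fn, w, v, h=1e-4):
    """Approximate the Hessian-vector product H v by a central
    finite difference of the gradient around w along direction v."""
    plus = grad_fn([wi + h * vi for wi, vi in zip(w, v)])
    minus = grad_fn([wi - h * vi for wi, vi in zip(w, v)])
    return [(p - m) / (2.0 * h) for p, m in zip(plus, minus)]

def top_hessian_eigenvalue(grad_fn, w, iters=100):
    """Power iteration on the Hessian at w.

    Returns the Rayleigh quotient v^T H v for the final unit vector v,
    which approaches the eigenvalue of largest magnitude.
    """
    v = [1.0] + [0.0] * (len(w) - 1)  # fixed unit start vector
    eig = 0.0
    for _ in range(iters):
        hv = hvp(grad_fn, w, v)
        eig = sum(a * b for a, b in zip(v, hv))  # v^T H v with ||v|| = 1
        norm = math.sqrt(sum(x * x for x in hv))
        if norm == 0.0:
            break
        v = [x / norm for x in hv]  # renormalize for the next step
    return eig

lam = top_hessian_eigenvalue(grad, [0.5, -0.25])
print(lam)
```

For this toy matrix the exact largest eigenvalue is (5 + sqrt(5)) / 2, about 3.618, so the printed estimate should agree closely. For a real network, the gradient function would come from automatic differentiation, and the Hessian would never be formed explicitly.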
Taxonomy
Topics: Advanced Neural Network Applications · Indoor and Outdoor Localization Technologies · Machine Learning and ELM
