Regularizing Deep Neural Networks with Stochastic Estimators of Hessian Trace
Yucong Liu, Shixing Yu, Tong Lin

TL;DR
This paper introduces a regularization technique for deep neural networks that penalizes the trace of the Hessian matrix, estimated with an efficient stochastic estimator, to improve generalization and favor flat minima.
Contribution
The paper proposes a novel Hessian trace regularizer for deep neural networks. The trace is estimated with Hutchinson's estimator, whose computation is accelerated by a dropout scheme, and the method is reported to outperform existing regularizers.
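For reference, the identity behind Hutchinson's estimator can be written as below. This is the standard textbook form, not necessarily the paper's notation: H denotes the Hessian, v a random probe vector with identity second moment (for example, independent Rademacher entries), and N the number of probes. All three symbols are assumptions introduced here for illustration.

```latex
\operatorname{Tr}(H)
  = \operatorname{Tr}\!\bigl(H\,\mathbb{E}[v v^{\top}]\bigr)
  = \mathbb{E}\bigl[v^{\top} H v\bigr]
  \approx \frac{1}{N} \sum_{k=1}^{N} v_k^{\top} H v_k ,
\qquad \mathbb{E}[v v^{\top}] = I .
```

Each term needs only a Hessian-vector product H v_k, so the full Hessian never has to be formed.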
Findings
Outperforms existing regularizers and data augmentation methods
Enhances generalization by promoting flat minima
Efficient Hessian trace estimation with dropout
Abstract
In this paper, we develop a novel regularization method for deep neural networks by penalizing the trace of Hessian. This regularizer is motivated by a recent guarantee bound of the generalization error. We explain its benefits in finding flat minima and avoiding Lyapunov stability in dynamical systems. We adopt the Hutchinson method as a classical unbiased estimator for the trace of a matrix and further accelerate its calculation using a dropout scheme. Experiments demonstrate that our method outperforms existing regularizers and data augmentation methods, such as Jacobian, Confidence Penalty, Label Smoothing, Cutout, and Mixup.
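The Hutchinson method mentioned in the abstract can be illustrated with a minimal sketch. It assumes a small explicit symmetric matrix standing in for a Hessian, Rademacher probe vectors, and hypothetical helper names (`matvec`, `hutchinson_trace`) that do not come from the paper. The dropout-based acceleration is not reproduced, since this summary does not specify it.

```python
import random


def matvec(matrix, vec):
    """Multiply a dense square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


def hutchinson_trace(matvec_fn, dim, num_probes=200, seed=0):
    """Estimate Tr(H) as the average of v^T (H v) over Rademacher probes v.

    matvec_fn is any callable returning H @ v; the matrix itself is never
    inspected, mirroring how a Hessian-vector product would be used.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_probes):
        # Rademacher probe: each entry is +1 or -1 with equal probability.
        v = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hv = matvec_fn(v)
        total += sum(vi * hvi for vi, hvi in zip(v, hv))
    return total / num_probes


# Toy symmetric matrix standing in for a Hessian (illustrative values only).
H = [[2.0, 0.5, 0.0],
     [0.5, 1.0, -0.3],
     [0.0, -0.3, 3.0]]
exact = sum(H[i][i] for i in range(3))
estimate = hutchinson_trace(lambda v: matvec(H, v), dim=3)

# For a diagonal matrix, every Rademacher probe returns the trace exactly.
D = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 3.0]]
diag_estimate = hutchinson_trace(lambda v: matvec(D, v), dim=3)
```

In a deep learning setting the explicit `matvec` would typically be replaced by a Hessian-vector product from automatic differentiation, and the resulting estimate added to the training loss with a weighting coefficient; both of those steps are outside this sketch.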
Taxonomy
Topics: Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques
Methods: Label Smoothing · Mixup · Cutout · Dropout
