Adaptive Normalized Risk-Averting Training For Deep Neural Networks
Zhiguang Wang, Tim Oates, James Lo

TL;DR
This paper introduces Adaptive Normalized Risk-Averting Training (ANRAT), a novel approach for training deep neural networks that improves optimization by adaptively learning a convexity parameter, leading to better performance on visual recognition tasks.
Contribution
The paper presents a new error criterion and training method, ANRAT, which adaptively learns a convexity index to enhance deep neural network training and optimization.
Findings
Achieves comparable or superior results to existing methods on MNIST and CIFAR-10.
Improves training of deep neural networks without pretraining or tricks.
Compatible with various network architectures and regularization techniques.
Abstract
This paper proposes a set of new error criteria and learning approaches, Adaptive Normalized Risk-Averting Training (ANRAT), to attack the non-convex optimization problem in training deep neural networks (DNNs). Theoretically, we demonstrate its effectiveness on global and local convexity lower-bounded by the standard -norm error. By analyzing the gradient on the convexity index , we explain the reason why to learn adaptively using gradient descent works. In practice, we show how this method improves training of deep neural networks to solve visual recognition tasks on the MNIST and CIFAR-10 datasets. Without using pretraining or other tricks, we obtain results comparable or superior to those reported in recent literature on the same tasks using standard ConvNets + MSE/cross entropy. Performance on deep/shallow multilayer perceptrons and Denoised Auto-encoders is…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
