Non-Convex Optimization with Spectral Radius Regularization
Adam Sandler, Diego Klabjan, Yuan Luo

TL;DR
This paper introduces a regularization technique that reduces the spectral radius of the loss Hessian when training deep neural networks. It steers optimization toward flat minima, which generalize better, and is evaluated on applications in several domains, including healthcare.
Contribution
It proposes a novel regularization method based on spectral radius reduction of the Hessian, along with efficient algorithms and convergence proofs, demonstrating superior generalization in diverse domains.
Findings
Models with spectral radius regularization outperform baselines on test data.
The algorithms converge almost surely and are effective in healthcare applications.
Regularization improves model generalization across different data distributions.
Abstract
We develop regularization methods to find flat minima while training deep neural networks. These minima generalize better than sharp minima, yielding models outperforming baselines on real-world test data (which may be distributed differently than the training data). Specifically, we propose a method of regularized optimization to reduce the spectral radius of the Hessian of the loss function. We also derive algorithms to efficiently optimize neural network models and prove that these algorithms almost surely converge. Furthermore, we demonstrate that our algorithm works effectively on applications in different domains, including healthcare. To show that our models generalize well, we introduce various methods for testing generalizability and find that our models outperform comparable baseline models on these tests.
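To make the central quantity concrete, here is a minimal sketch of how the spectral radius of a loss Hessian can be estimated without forming the Hessian, using power iteration on finite-difference Hessian-vector products. The toy loss, the function names (`loss`, `grad`, `hvp`, `spectral_radius`, `regularized_loss`), and the penalty form `loss(w) + mu * rho(H(w))` are illustrative assumptions, not the paper's exact formulation or algorithm.

```python
import math
import random


def loss(w):
    # Hypothetical toy loss: curvature along w[0] grows with |w[0]|.
    return 0.5 * (4.0 * w[0] ** 2 + w[1] ** 2) + 0.25 * w[0] ** 4


def grad(w):
    # Analytic gradient of the toy loss above.
    return [4.0 * w[0] + w[0] ** 3, w[1]]


def hvp(grad_fn, w, v, eps=1e-4):
    # Hessian-vector product via a central difference of gradients,
    # so the Hessian matrix is never formed explicitly.
    plus = grad_fn([wi + eps * vi for wi, vi in zip(w, v)])
    minus = grad_fn([wi - eps * vi for wi, vi in zip(w, v)])
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]


def spectral_radius(grad_fn, w, iters=100, seed=0):
    # Power iteration: repeatedly apply H to a unit vector and renormalize.
    # For a symmetric Hessian this converges to the eigenvalue of largest
    # magnitude, whose absolute value is the spectral radius.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in w]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    rho = 0.0
    for _ in range(iters):
        hv = hvp(grad_fn, w, v)
        rho = abs(sum(a * b for a, b in zip(v, hv)))  # |Rayleigh quotient|
        norm = math.sqrt(sum(x * x for x in hv))
        if norm == 0.0:
            break
        v = [x / norm for x in hv]
    return rho


def regularized_loss(w, mu):
    # Assumed penalty form: loss plus mu times the Hessian spectral radius.
    return loss(w) + mu * spectral_radius(grad, w)
```

For this toy loss the exact Hessian is diag(4 + 3 w[0]^2, 1), so `spectral_radius(grad, [1.0, 2.0])` should come out close to 7, while the flatter point `[0.0, 0.0]` gives a value close to 4. A regularizer of this kind penalizes the sharper point more heavily.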
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research · Face and Expression Recognition
