Deforming the Loss Surface
Liangming Chen, Long Jin, Xiujuan Du, Shuai Li, and Mei Liu

TL;DR
This paper introduces deformation operators to modify the loss surface in deep learning, leading to flatter minima and improved generalization, demonstrated through theoretical analysis and experiments on multiple datasets.
Contribution
It proposes a novel deformation operator concept and various deformation functions to enhance loss surface optimization and generalization in deep neural networks.
Findings
Deformation functions lead to flatter minima.
Enhanced models outperform original ones on multiple datasets.
The approach improves accuracy with minimal computational overhead.
Abstract
In deep learning, the shape of the loss surface is usually assumed to be fixed. In contrast, this paper proposes the novel concept of a deformation operator, which deforms the loss surface and thereby improves optimization. A deformation function, one type of deformation operator, can improve generalization performance. Various deformation functions are designed, and their contributions to the loss surface are described. The original stochastic gradient descent optimizer is then theoretically proved to be a flat-minima filter, capable of filtering out sharp minima. Furthermore, flatter minima can be obtained by exploiting the proposed deformation functions, which is verified on CIFAR-100 with visualizations of the loss landscapes near the critical points obtained by both the original optimizer and the optimizer enhanced by deformation…
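The idea of applying a deformation function to the loss can be illustrated with a minimal sketch. The abstract does not give the paper's actual deformation functions, so everything here is an assumption made for illustration: the toy one-dimensional loss, the choice delta(L) = log(1 + L), and all function names are hypothetical. By the chain rule the deformed gradient is delta'(L) times the original gradient, and at a critical point the curvature is scaled by delta'(L), so a deformation with slope below 1 there yields a flatter minimum at the same location.

```python
import math

# Toy 1-D loss with a single minimum at w = 2 (hypothetical, for illustration only).
def loss(w):
    return (w - 2.0) ** 2 + 0.5

def loss_grad(w):
    return 2.0 * (w - 2.0)

# Assumed deformation function delta(L) = log(1 + L); NOT taken from the paper.
# It is monotonically increasing for L >= 0, so the location of the minimum is unchanged.
def deform(value):
    return math.log1p(value)

# Derivative delta'(L) = 1 / (1 + L), which is at most 1 for L >= 0.
def deform_slope(value):
    return 1.0 / (1.0 + value)

def deformed_loss(w):
    return deform(loss(w))

# Chain rule: d/dw delta(L(w)) = delta'(L(w)) * L'(w).
def deformed_grad(w):
    return deform_slope(loss(w)) * loss_grad(w)

# Central finite-difference estimate of the second derivative (a simple sharpness proxy).
def curvature(f, w, h=1e-3):
    return (f(w + h) - 2.0 * f(w) + f(w - h)) / (h * h)

if __name__ == "__main__":
    print("original curvature at minimum:", curvature(loss, 2.0))
    print("deformed curvature at minimum:", curvature(deformed_loss, 2.0))
```

In this toy setting the original curvature at the minimum is 2, while the deformed curvature is 2 / (1 + 0.5), roughly 1.33, because the loss value at the minimum is 0.5. This only shows the mechanism on a hand-picked function; it says nothing about which deformation functions the authors use or how they behave on real networks.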
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
