Deforming the Loss Surface

Liangming Chen, Long Jin, Xiujuan Du, Shuai Li, and Mei Liu

arXiv:2007.12515 · cs.CV · September 15, 2020

Open Access

TL;DR

This paper introduces deformation operators that modify the loss surface in deep learning, leading to flatter minima and improved generalization. The claims are supported by theoretical analysis and by experiments on multiple datasets.

Contribution

The paper proposes the novel concept of a deformation operator, together with various deformation functions, to improve optimization over the loss surface and generalization in deep neural networks.

Findings

01 Deformation functions lead to flatter minima.

02 Models enhanced by deformation functions outperform the original models on multiple datasets.

03 The approach improves accuracy with minimal computational overhead.

Abstract

In deep learning, the shape of the loss surface is usually assumed to be fixed. In contrast, this paper proposes the novel concept of a deformation operator, which deforms the loss surface and thereby improves optimization. The deformation function, one type of deformation operator, can improve generalization performance. Various deformation functions are designed, and their contributions to the loss surface are described. The original stochastic gradient descent optimizer is then theoretically proved to be a flat minima filter, capable of filtering out sharp minima. Furthermore, flatter minima can be obtained by exploiting the proposed deformation functions, which is verified on CIFAR-100 with visualizations of the loss landscapes near the critical points obtained by both the original optimizer and the optimizer enhanced by deformation…
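As a rough illustration of what deforming a loss can mean in practice, the sketch below composes a toy one-dimensional loss with a scalar function and runs gradient descent on both versions. Everything in it is an assumption made for illustration: the quadratic `loss`, the `log(1 + L)` form of `deform`, and the helper names are hypothetical and are not taken from the paper, whose actual deformation functions are not given on this page. The only general fact used is the chain rule: for a differentiable scalar function applied to the loss, the gradient of the composed loss equals the derivative of that function times the original gradient.

```python
import math


def loss(theta):
    """Toy one-dimensional loss with its minimum at theta = 2 (illustrative only)."""
    return (theta - 2.0) ** 2


def loss_grad(theta):
    """Derivative of the toy loss."""
    return 2.0 * (theta - 2.0)


def deform(value):
    """Hypothetical deformation function log(1 + L); NOT one of the paper's functions."""
    return math.log1p(value)


def deform_grad(value):
    """Derivative of log(1 + L) with respect to L."""
    return 1.0 / (1.0 + value)


def deformed_grad(theta):
    """Chain rule: d/dtheta deform(loss(theta)) = deform'(loss) * loss'(theta)."""
    return deform_grad(loss(theta)) * loss_grad(theta)


def gradient_descent(grad_fn, theta0, lr=0.1, steps=200):
    """Plain gradient descent on a scalar parameter."""
    theta = theta0
    for _ in range(steps):
        theta -= lr * grad_fn(theta)
    return theta


# Same starting point, same optimizer; only the loss surface differs.
plain = gradient_descent(loss_grad, theta0=5.0)
deformed = gradient_descent(deformed_grad, theta0=5.0)
```

Because this hypothetical `deform` is strictly increasing on non-negative losses, the deformed loss keeps the same minimizer as the original; only the size of each gradient step changes. Whether a given deformation leads to flatter minima in a real network is the paper's claim and is not something this toy example demonstrates.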

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning