G-TRACER: Expected Sharpness Optimization

John Williams; Stephen Roberts

arXiv:2306.13914·stat.ML·June 27, 2023

TL;DR

G-TRACER is a regularization method that steers deep learning optimizers toward flat minima to improve generalization, particularly on low signal-to-noise ratio problems. It is derived as an approximation to natural-gradient descent on a generalized Bayes objective.

Contribution

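The paper presents G-TRACER, a curvature-based regularization scheme that is simple to implement as a modification to existing optimizers (e.g., SGD and Adam), needs little tuning, and is theoretically grounded as an approximation to natural-gradient descent on a generalized Bayes objective.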

Findings

01

Achieves competitive results on vision and NLP benchmarks.

02

Effectively handles low signal-to-noise ratio problems.

03

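Converges to a neighborhood of a local minimum of the unregularized objective, with the size of the neighborhood depending on the regularization strength.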

Abstract

We propose a new regularization scheme for the optimization of deep learning architectures, G-TRACER ("Geometric TRACE Ratio"), which promotes generalization by seeking flat minima, and has a sound theoretical basis as an approximation to a natural-gradient descent based optimization of a generalized Bayes objective. By augmenting the loss function with a TRACER, curvature-regularized optimizers (e.g., SGD-TRACER and Adam-TRACER) are simple to implement as modifications to existing optimizers and do not require extensive tuning. We show that the method converges to a neighborhood (depending on the regularization strength) of a local minimum of the unregularized objective, and demonstrate competitive performance on a number of benchmark computer vision and NLP datasets, with a particular focus on challenging low signal-to-noise ratio problems.
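
The abstract describes augmenting the loss with a curvature term but does not give the exact form of the TRACER regularizer. As a rough illustration of the general idea only, the sketch below adds a penalty proportional to the trace of the Hessian, estimated with Hutchinson's method and finite-difference Hessian-vector products. The Hessian-trace penalty, the function names (`hutchinson_trace`, `curvature_regularized_loss`), the parameter `rho`, and the toy quadratic loss are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def hutchinson_trace(grad_fn, w, n_probes=8, eps=1e-3, seed=0):
    """Estimate Tr(H) at w, where H is the Hessian of the loss with gradient grad_fn.

    Uses Rademacher probes v and the identity E[v^T H v] = Tr(H).
    The product H v is approximated by a central difference of gradients.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_probes):
        v = rng.choice([-1.0, 1.0], size=w.shape)  # Rademacher probe vector
        hv = (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2.0 * eps)
        total += float(v @ hv)
    return total / n_probes


def curvature_regularized_loss(loss_fn, grad_fn, w, rho):
    """Loss plus rho times an estimated Hessian trace.

    This is a stand-in curvature penalty (an assumption), not the paper's TRACER term.
    """
    return loss_fn(w) + rho * hutchinson_trace(grad_fn, w)


# Toy quadratic loss 0.5 * w^T diag(d) w, whose Hessian is diag(d).
d = np.array([1.0, 2.0, 3.0])


def loss_fn(w):
    return 0.5 * float(w @ (d * w))


def grad_fn(w):
    return d * w


w0 = np.array([1.0, -1.0, 0.5])
trace_est = hutchinson_trace(grad_fn, w0)
reg_loss = curvature_regularized_loss(loss_fn, grad_fn, w0, rho=0.1)
print(trace_est, reg_loss)
```

Larger values of `rho` weight the curvature term more heavily, which loosely mirrors the abstract's statement that the regularization strength controls how far the solution may sit from a minimum of the unregularized objective. A practical optimizer would also need gradients of the penalty, which this sketch does not compute.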

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Advanced Image and Video Retrieval Techniques
