Geometry of Optimization and Implicit Regularization in Deep Learning

Behnam Neyshabur; Ryota Tomioka; Ruslan Salakhutdinov; Nathan Srebro

arXiv:1705.03071 · cs.LG · May 10, 2017 · 89 cites

TL;DR

This paper argues that the geometry of the parameter space shapes implicit regularization and generalization in deep learning, and proposes an optimization algorithm attuned to that geometry to improve generalization.

Contribution

It introduces a geometric perspective on optimization in deep learning and develops an optimization algorithm tailored to this geometry to enhance generalization.

Findings

01. Generalization is governed by implicit regularization, not network size.

02. Changing optimization procedures can improve generalization.

03. Geometry-aware optimization enhances deep learning performance.

Abstract

We argue that the optimization plays a crucial role in generalization of deep learning models through implicit regularization. We do this by demonstrating that generalization ability is not controlled by network size but rather by some other implicit control. We then demonstrate how changing the empirical optimization procedure can improve generalization, even if actual optimization quality is not affected. We do so by studying the geometry of the parameter space of deep networks, and devising an optimization algorithm attuned to this geometry.
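
The abstract does not say which geometric property of the parameter space it relies on. As a purely illustrative sketch, the snippet below checks one well-known property of ReLU networks: scaling a hidden unit's incoming weights by c > 0 and its outgoing weight by 1/c leaves the computed function unchanged while changing the Euclidean norm of the weights. The two-layer network, the function names (`forward`, `rescale_unit`, `l2_norm_sq`), and the numbers are all assumptions made for this example, not the paper's algorithm.

```python
def relu(x):
    return x if x > 0.0 else 0.0

def forward(w_in, w_out, x):
    """Two-layer ReLU net: y = sum_j w_out[j] * relu(sum_i w_in[j][i] * x[i])."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    return sum(v * h for v, h in zip(w_out, hidden))

def rescale_unit(w_in, w_out, j, c):
    """Scale hidden unit j's incoming weights by c > 0 and its outgoing weight by 1/c."""
    new_in = [list(row) for row in w_in]
    new_out = list(w_out)
    new_in[j] = [c * w for w in new_in[j]]
    new_out[j] = new_out[j] / c
    return new_in, new_out

def l2_norm_sq(w_in, w_out):
    """Squared Euclidean norm of all weights."""
    return sum(w * w for row in w_in for w in row) + sum(v * v for v in w_out)

# Hypothetical weights and input, chosen only for illustration.
w_in = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
w_out = [2.0, -1.0]
x = [1.0, 0.5, 0.25]

w_in2, w_out2 = rescale_unit(w_in, w_out, j=0, c=4.0)

y_before = forward(w_in, w_out, x)
y_after = forward(w_in2, w_out2, x)

print("output before:", y_before)
print("output after: ", y_after)
print("squared norm before:", l2_norm_sq(w_in, w_out))
print("squared norm after: ", l2_norm_sq(w_in2, w_out2))
```

Both parameter settings compute the same function, yet they sit at very different points in weight space, so an update rule defined through the Euclidean geometry of the weights can behave differently on the two. This is one way to read the abstract's distinction between optimization quality and the choice of optimization procedure, offered here as an interpretation rather than as the authors' argument.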

Code & Models

Repositories

bneyshabur/generalization-bounds (PyTorch, official)

Taxonomy

Topics: Advanced Numerical Analysis Techniques · Image and Object Detection Techniques · Medical Image Segmentation Techniques