The role of optimization geometry in single neuron learning

Nicholas M. Boffi, Stephen Tu, and Jean-Jacques E. Slotine

arXiv:2006.08575·cs.LG·April 25, 2022

Open Access

TL;DR

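This paper studies how the optimization geometry used during training affects generalization. It analyzes pseudogradient methods for generalized linear models under the square loss, a simplified setting that includes a single neuron as a special case, and proves non-asymptotic generalization bounds supported by experiments.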

Contribution

The paper gives a theoretical analysis of a family of pseudogradient methods for generalized linear models, showing how the optimization geometry influences out-of-sample performance.

Findings

01. Optimization geometry impacts generalization error bounds.

02. Theoretical bounds characterize the interplay between geometry and feature space.

03. Experimental results show improved performance with geometry-informed methods.

Abstract

Recent numerical experiments have demonstrated that the choice of optimization geometry used during training can impact generalization performance when learning expressive nonlinear model classes such as deep neural networks. These observations have important implications for modern deep learning but remain poorly understood due to the difficulty of the associated nonconvex optimization problem. Towards an understanding of this phenomenon, we analyze a family of pseudogradient methods for learning generalized linear models under the square loss - a simplified problem containing both nonlinearity in the model parameters and nonconvexity of the optimization which admits a single neuron as a special case. We prove non-asymptotic bounds on the generalization error that sharply characterize how the interplay between the optimization geometry and the feature space geometry sets the…
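
The abstract does not spell out the update rule, so the following is only a minimal sketch of what a pseudogradient method with a non-Euclidean geometry could look like. It assumes the pseudogradient is the square-loss gradient with the link-derivative factor dropped (as in GLM-tron-style updates) and that the geometry enters through a mirror-descent step. The logistic link, the separable potential psi(theta) = (1/p) * sum_j |theta_j|^p, and all function names are illustrative assumptions, not taken from the paper.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic link, written via tanh.
    return 0.5 * (1.0 + math.tanh(0.5 * z))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pseudogradient(theta, xs, ys, link=sigmoid):
    # Mean of (link(<theta, x>) - y) * x. Unlike the true square-loss
    # gradient, the factor link'(<theta, x>) is left out (assumption).
    n = len(xs)
    g = [0.0] * len(theta)
    for x, y in zip(xs, ys):
        residual = link(dot(theta, x)) - y
        for j, xj in enumerate(x):
            g[j] += residual * xj / n
    return g


def mirror_map(theta, p):
    # Gradient of the illustrative potential (1/p) * sum_j |theta_j|^p, p > 1.
    return [math.copysign(abs(t) ** (p - 1.0), t) for t in theta]


def inverse_mirror_map(z, p):
    # Inverse of mirror_map, applied coordinate-wise.
    return [math.copysign(abs(v) ** (1.0 / (p - 1.0)), v) for v in z]


def train(xs, ys, p=2.0, step=1.0, iters=2000):
    # Mirror descent driven by the pseudogradient: step in the dual
    # variable z = mirror_map(theta), then map back to theta.
    theta = [0.0] * len(xs[0])
    for _ in range(iters):
        g = pseudogradient(theta, xs, ys)
        z = [zj - step * gj for zj, gj in zip(mirror_map(theta, p), g)]
        theta = inverse_mirror_map(z, p)
    return theta


def square_loss(theta, xs, ys, link=sigmoid):
    return sum((link(dot(theta, x)) - y) ** 2
               for x, y in zip(xs, ys)) / (2 * len(xs))


def make_data(theta_star, n, seed=0):
    # Hypothetical noiseless, realizable data: y = sigmoid(<theta_star, x>).
    rng = random.Random(seed)
    xs = [[rng.uniform(-1.0, 1.0) for _ in theta_star] for _ in range(n)]
    ys = [sigmoid(dot(theta_star, x)) for x in xs]
    return xs, ys
```

With p = 2 the mirror map is the identity and the step reduces to a plain Euclidean pseudogradient update; other values of p change the geometry in which the same pseudogradient is applied, which is the kind of choice the abstract refers to.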

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Neural Networks and Applications · Machine Learning and ELM

Methods: Linear Regression