Geometry Perspective Of Estimating Learning Capability Of Neural Networks

Ankan Dutta; Arnab Rakshit

arXiv:2011.04588·cs.LG·December 2, 2020

TL;DR

This paper employs geometric and statistical methods to analyze neural network learning capabilities, revealing links between generalization, convergence, and stability, and connecting neural network theory with high-energy physics principles.

Contribution

It introduces a geometric framework to evaluate neural network generalization and stability, and relates these properties to convergence rates and physical principles.

Findings

01

Higher generalization correlates with slower convergence.

02

Neural network stability is linked to the stabilization of the Hessian matrix.

03

The study connects neural network learning with high-energy physics concepts.

Abstract

The paper uses statistical and differential-geometric arguments to obtain prior information about the learning capability of an artificial neural network on a given dataset. It considers a broad class of neural networks with generalized architecture performing simple least-squares regression with stochastic gradient descent (SGD). The system characteristics at two critical epochs in the learning trajectory are analyzed. During some epochs of the training phase, the system reaches equilibrium, with the generalization capability attaining a maximum. The system can also be coherent with localized, non-equilibrium states, which are characterized by the stabilization of the Hessian matrix. The paper proves that neural networks with higher generalization capability will have a slower convergence rate. The relationship between the generalization capability and the stability of the…

Taxonomy

Topics: Neural Networks and Applications · Statistical Mechanics and Entropy · Model Reduction and Neural Networks