Geometry Perspective Of Estimating Learning Capability Of Neural Networks
Ankan Dutta, Arnab Rakshit

TL;DR
This paper employs geometric and statistical methods to analyze neural network learning capabilities, revealing links between generalization, convergence, and stability, and connecting neural network theory with high-energy physics principles.
Contribution
It introduces a geometric framework to evaluate neural network generalization and stability, and relates these properties to convergence rates and physical principles.
Findings
Networks with higher generalization capability converge more slowly.
Neural network stability is linked to stabilization of the Hessian matrix.
The study connects neural network learning with high-energy physics concepts.
Abstract
The paper uses statistical and differential geometric motivation to acquire prior information about the learning capability of an artificial neural network on a given dataset. The paper considers a broad class of neural networks with generalized architecture performing simple least-squares regression with stochastic gradient descent (SGD). The system characteristics at two critical epochs in the learning trajectory are analyzed. During some epochs of the training phase, the system reaches equilibrium, with the generalization capability attaining a maximum. The system can also be coherent with localized, non-equilibrium states, which are characterized by the stabilization of the Hessian matrix. The paper proves that neural networks with higher generalization capability will have a slower convergence rate. The relationship between the generalization capability and the stability of the…
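To make the setting in the abstract concrete, here is a minimal sketch of least-squares regression trained with SGD, recording after each epoch how much the Hessian of the full-batch loss has changed. Everything specific here is an assumption for illustration and is not taken from the paper: the two-parameter tanh model, the synthetic data, the learning rate, the finite-difference Hessian, and the use of the Frobenius norm of the epoch-to-epoch Hessian change as a rough proxy for "stabilization".

```python
import math
import random

def predict(w, x):
    # Hypothetical toy model: one tanh unit scaled by an output weight.
    return w[1] * math.tanh(w[0] * x)

def loss(w, data):
    # Least-squares objective: half the mean squared error over the dataset.
    return sum((predict(w, x) - y) ** 2 for x, y in data) / (2 * len(data))

def grad_single(w, x, y):
    # Analytic gradient of 0.5 * (predict(w, x) - y)^2 for one sample.
    t = math.tanh(w[0] * x)
    r = w[1] * t - y
    return [r * w[1] * (1 - t * t) * x, r * t]

def hessian(w, data, eps=1e-4):
    # Central finite-difference estimate of the full-batch loss Hessian.
    n = len(w)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                v = list(w)
                v[i] += di
                v[j] += dj
                return loss(v, data)
            H[i][j] = (shifted(eps, eps) - shifted(eps, -eps)
                       - shifted(-eps, eps) + shifted(-eps, -eps)) / (4 * eps * eps)
    return H

def train(epochs=200, lr=0.05, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(-2, 2) for _ in range(64)]
    # Synthetic targets from an assumed ground-truth model plus small noise.
    data = [(x, 1.5 * math.tanh(0.8 * x) + rng.gauss(0, 0.05)) for x in xs]
    w = [0.3, 0.3]
    prev_H = hessian(w, data)
    history = []  # one (loss, hessian_drift) pair per epoch
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:  # plain SGD: one sample per update
            g = grad_single(w, x, y)
            w = [wi - lr * gi for wi, gi in zip(w, g)]
        H = hessian(w, data)
        drift = math.sqrt(sum((H[i][j] - prev_H[i][j]) ** 2
                              for i in range(2) for j in range(2)))
        history.append((loss(w, data), drift))
        prev_H = H
    return w, history

if __name__ == "__main__":
    w, history = train()
    for epoch in (0, 9, 49, 199):
        l, d = history[epoch]
        print(f"epoch {epoch + 1:3d}  loss={l:.5f}  hessian_drift={d:.5f}")
```

In this toy run the drift value shrinking toward zero is one simple way to read "the Hessian has stabilized"; the paper's own geometric criterion may differ.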
Taxonomy
Topics: Neural Networks and Applications · Statistical Mechanics and Entropy · Model Reduction and Neural Networks
