Information Geometry of Evolution of Neural Network Parameters While Training
Abhiram Anand Thiruthummal, Eun-jin Kim, Sergiy Shelyag

TL;DR
This paper applies information geometry to analyze neural network training, revealing phase transition-like behavior linked to overfitting, and offers new insights into the interpretability of neural network parameter evolution.
Contribution
It introduces an information geometric framework to study neural network training dynamics and identifies phase transition-like behavior related to overfitting.
Findings
Observed transition in parameter evolution correlates with overfitting.
Identified phase transition-like behavior using Fisher information metric.
Preliminary finite-size scaling results support the phase transition analogy.
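As a minimal illustration of the Fisher information metric named in the findings, the sketch below numerically estimates the metric for a univariate Gaussian family in the coordinates (mu, sigma). The Gaussian family, the function name `gaussian_fisher_metric`, and the grid settings are assumptions chosen for illustration. This summary does not state which distribution or estimator the authors use.

```python
import numpy as np

def gaussian_fisher_metric(mu, sigma, n_grid=20001, width=12.0):
    """Estimate the Fisher information matrix of N(mu, sigma^2).

    Coordinates are (mu, sigma). The matrix is computed as the expectation
    of the outer product of the score, using a simple Riemann sum on a grid.
    Illustrative assumption: a Gaussian family, not the paper's own model.
    """
    x = np.linspace(mu - width * sigma, mu + width * sigma, n_grid)
    dx = x[1] - x[0]
    pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

    # Score: gradient of log p(x; mu, sigma) with respect to (mu, sigma).
    score_mu = (x - mu) / sigma**2
    score_sigma = ((x - mu) ** 2 - sigma**2) / sigma**3
    scores = np.stack([score_mu, score_sigma])  # shape (2, n_grid)

    # E[score score^T], approximated by summing over the grid.
    return (scores * pdf) @ scores.T * dx       # shape (2, 2)

metric = gaussian_fisher_metric(mu=0.3, sigma=2.0)
print(metric)
```

For this family the closed-form metric is diag(1/sigma^2, 2/sigma^2), so the numerical estimate can be checked against it. Distances on the resulting manifold are obtained by integrating the line element defined by this metric along a path of distributions.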
Abstract
Artificial neural networks (ANNs) are powerful tools capable of approximating any arbitrary mathematical function, but their interpretability remains limited, rendering them black-box models. To address this issue, numerous methods have been proposed to enhance the explainability and interpretability of ANNs. In this study, we introduce the application of an information geometric framework to investigate phase transition-like behavior during the training of ANNs and relate these transitions to overfitting in certain models. The evolution of ANNs during training is studied by looking at the probability distribution of their parameters. Information geometry, which utilizes the principles of differential geometry, offers a unique perspective on probability and statistics by considering probability density functions as points on a Riemannian manifold. We create this manifold using a metric based…
Taxonomy
Topics: Neural Networks and Applications
