Information Geometry of Evolution of Neural Network Parameters While Training

Abhiram Anand Thiruthummal; Eun-jin Kim; Sergiy Shelyag

arXiv:2406.05295·cs.LG·June 11, 2024

TL;DR

This paper applies information geometry to analyze neural network training, revealing phase transition-like behavior linked to overfitting, and offers new insights into the interpretability of neural network parameter evolution.

Contribution

It introduces an information geometric framework to study neural network training dynamics and identifies phase transition-like behavior related to overfitting.

Findings

01. Observed transition in parameter evolution correlates with overfitting.

02. Identified phase transition-like behavior using Fisher information metric.

03. Preliminary finite-size scaling results support the phase transition analogy.

Abstract

Artificial neural networks (ANNs) are powerful tools capable of approximating arbitrary mathematical functions, but their interpretability remains limited, so they are often treated as black-box models. To address this issue, numerous methods have been proposed to enhance the explainability and interpretability of ANNs. In this study, we apply an information geometric framework to investigate phase transition-like behavior during the training of ANNs and relate these transitions to overfitting in certain models. The evolution of ANNs during training is studied by looking at the probability distribution of their parameters. Information geometry, which uses the principles of differential geometry, offers a unique perspective on probability and statistics by considering probability density functions as points on a Riemannian manifold. We create this manifold using a metric based…
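To make the idea of treating parameter distributions as points on a Riemannian manifold concrete, here is a minimal sketch, not the authors' method. It assumes, purely for illustration, that the parameter distribution at each training step is summarized as a univariate Gaussian, for which the Fisher information metric is known to be ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2. The function names `summarize`, `fisher_step_length`, and `information_length` are hypothetical, and the excerpt above does not state which distributional form the paper actually uses.

```python
import math


def summarize(params):
    """Return (mean, std) of a flat list of parameter values.

    Hypothetical helper: assumes a single Gaussian summary is adequate,
    which is an illustrative assumption, not a claim from the paper.
    """
    n = len(params)
    mean = sum(params) / n
    var = sum((p - mean) ** 2 for p in params) / n  # population variance
    return mean, math.sqrt(var)


def fisher_step_length(mu0, sigma0, mu1, sigma1):
    """Approximate Fisher-Rao length of one small step between
    N(mu0, sigma0^2) and N(mu1, sigma1^2).

    Uses ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2 with sigma taken at the
    midpoint of the step, so it is only accurate for small steps.
    """
    sigma_mid = 0.5 * (sigma0 + sigma1)
    d_mu = mu1 - mu0
    d_sigma = sigma1 - sigma0
    return math.sqrt(d_mu ** 2 + 2.0 * d_sigma ** 2) / sigma_mid


def information_length(snapshots):
    """Sum the step lengths along a sequence of (mean, std) snapshots."""
    total = 0.0
    for (m0, s0), (m1, s1) in zip(snapshots, snapshots[1:]):
        total += fisher_step_length(m0, s0, m1, s1)
    return total
```

In use, one would call `summarize` on the flattened weights after each training step and pass the resulting list of `(mean, std)` pairs to `information_length`. Under these assumptions, a sudden change in how fast this length grows would be one possible signature of the kind of transition the abstract describes.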

Taxonomy

Topics: Neural Networks and Applications