On the Variance of the Fisher Information for Deep Learning
Alexander Soen, Ke Sun

TL;DR
This paper investigates the variance of two sample-based estimators of the Fisher information matrix in deep learning, analyzing how neural network structure influences estimation quality and discussing what the variance and its upper bounds mean in that setting.
Contribution
The paper provides a theoretical analysis of the variance of Fisher information estimators in deep neural networks, linking the network's parametric structure to estimation quality.
Findings
Derived closed-form variance expressions for Fisher estimators
Showed network structure impacts estimator variance
Discussed bounds and implications for deep learning analysis
Abstract
In the realm of deep learning, the Fisher information matrix (FIM) gives novel insights and useful tools to characterize the loss landscape, perform second-order optimization, and build geometric learning theories. The exact FIM is either unavailable in closed form or too expensive to compute. In practice, it is almost always estimated based on empirical samples. We investigate two such estimators based on two equivalent representations of the FIM, both of which are unbiased and consistent. Their estimation quality is naturally gauged by their variance, which is given in closed form. We analyze how the parametric structure of a deep neural network can affect the variance. The meaning of this variance measure and its upper bounds are then discussed in the context of deep learning.
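The abstract does not spell out the two representations. A standard pair, assumed here, is the expected outer product of the score (the gradient of the log-likelihood) and the negative expected Hessian of the log-likelihood. The sketch below is a toy illustration and not the paper's setup: it uses a categorical model parameterized directly by its logits, and every function name is hypothetical. In this particular model the Hessian does not depend on the sampled label, so the Hessian-based estimate has no sampling variance, while the score-based estimate fluctuates around the exact FIM.

```python
import numpy as np


def softmax(theta):
    """Numerically stable softmax of a 1-D logit vector."""
    z = np.exp(theta - theta.max())
    return z / z.sum()


def exact_fim(theta):
    """Exact FIM of a categorical model w.r.t. its logits: diag(p) - p p^T."""
    p = softmax(theta)
    return np.diag(p) - np.outer(p, p)


def score_estimator(theta, n, rng):
    """Average of score outer products over n labels drawn from the model.

    For logits theta, the score of label y is e_y - p (one-hot minus softmax).
    """
    p = softmax(theta)
    ys = rng.choice(len(p), size=n, p=p)
    scores = np.eye(len(p))[ys] - p  # shape (n, k)
    return scores.T @ scores / n


def hessian_estimator(theta, n, rng):
    """Average of negative log-likelihood Hessians over n sampled labels.

    In this toy model the Hessian w.r.t. the logits is -(diag(p) - p p^T)
    for every label, so all n terms in the average are identical.
    """
    p = softmax(theta)
    ys = rng.choice(len(p), size=n, p=p)
    neg_hess = np.diag(p) - np.outer(p, p)
    return np.mean([neg_hess for _ in ys], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.array([0.5, -1.0, 2.0])  # arbitrary illustrative logits
    print("exact FIM:\n", exact_fim(theta))
    print("score-based estimate:\n", score_estimator(theta, 10_000, rng))
    print("Hessian-based estimate:\n", hessian_estimator(theta, 10_000, rng))
```

Rerunning the score-based estimator with different seeds gives a rough empirical view of its variance. In a deep network the parameters sit behind further layers, the Hessian generally does depend on the sampled label, and the comparison between the two estimators is no longer this one-sided.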
Taxonomy
Topics: Blind Source Separation Techniques · Sparse and Compressive Sensing Techniques · Statistical Mechanics and Entropy
