Understanding Artificial Neural Network's Behavior from Neuron Activation Perspective
Yizhou Zhang, Yang Sui

TL;DR
This paper introduces a probabilistic framework analyzing neuron activation patterns in deep neural networks, revealing theoretical insights into neural scaling laws, generalization, and loss decay related to dataset size and over-parameterization.
Contribution
It presents a novel probabilistic model linking neuron activation dynamics to neural scaling laws, offering theoretical explanations for empirical phenomena in DNNs.
Findings
The number of activated neurons increases following a specific mathematical form.
Neuron activation follows a power-law distribution.
Loss decays as dataset size increases, following a power-law relationship.
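As a rough illustration of the last finding, a power-law relationship between loss and dataset size, L(D) ≈ a · D^(-alpha), appears as a straight line on log-log axes, so its exponent can be estimated with ordinary least squares on the logarithms. The sketch below is not the paper's method. The function name `fit_power_law`, the symbols `a` and `alpha`, and the synthetic values are all assumptions made for illustration.

```python
import math


def fit_power_law(dataset_sizes, losses):
    """Estimate (a, alpha) in loss ≈ a * D**(-alpha) via log-log least squares.

    Hypothetical helper for illustration; not taken from the paper.
    """
    xs = [math.log(d) for d in dataset_sizes]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form slope and intercept of the best-fit line y = slope * x + intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A power law a * D**(-alpha) has log-log slope -alpha and intercept log(a).
    return math.exp(intercept), -slope


# Synthetic, noise-free data (illustrative values, not results from the paper).
sizes = [1e3, 1e4, 1e5, 1e6]
losses = [5.0 * d ** -0.3 for d in sizes]

a, alpha = fit_power_law(sizes, losses)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

On real loss curves the fit would only be meaningful over the range where the curve is approximately linear in log-log space; a phase transition like the one the abstract mentions would show up as a change in slope.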
Abstract
This paper explores the intricate behavior of deep neural networks (DNNs) through the lens of neuron activation dynamics. We propose a probabilistic framework that analyzes models' neuron activation patterns as a stochastic process, uncovering theoretical insights into neural scaling laws, such as over-parameterization and the power-law decay of loss with respect to dataset size. By deriving key mathematical relationships, we show that the number of activated neurons increases in the form of , and that neuron activation should follow a power-law distribution. Based on these two mathematical results, we demonstrate how DNNs maintain generalization capabilities even under over-parameterization, and we elucidate the phase transition phenomenon observed in loss curves when dataset size is plotted on a log axis (i.e., the data magnitude increases linearly). Moreover, by…
Taxonomy
Topics: Neural Networks and Applications
