Understanding Artificial Neural Network's Behavior from Neuron Activation Perspective

Yizhou Zhang; Yang Sui

arXiv:2412.18073·cs.AI·December 25, 2024

Open Access

TL;DR

This paper introduces a probabilistic framework for analyzing neuron activation patterns in deep neural networks. The framework yields theoretical insights into neural scaling laws, generalization under over-parameterization, and the decay of loss with dataset size.

Contribution

It presents a novel probabilistic model linking neuron activation dynamics to neural scaling laws, offering theoretical explanations for empirical phenomena in DNNs.

Findings

01

The number of activated neurons increases in the form of $N (1 - (\frac{b N}{D + b N})^{b})$.

02

Neuron activation follows a power-law distribution.

03

Loss decays as dataset size increases, following a power-law relationship.
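
As a rough illustration of the third finding, the sketch below generates a synthetic power-law loss curve and recovers its exponent from a straight-line fit in log-log space. The functional form `a * D ** (-alpha)`, the values of `a` and `alpha`, and the helper names are illustrative assumptions; they are not taken from the paper.

```python
import math

def power_law_loss(D, a=1.0, alpha=0.5):
    # Hypothetical loss model: L(D) = a * D^(-alpha). Both constants are
    # illustrative assumptions, not values reported in the paper.
    return a * D ** (-alpha)

def loglog_slope(xs, ys):
    # Ordinary least-squares slope of log(y) against log(x).
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    den = sum((x - mean_x) ** 2 for x in lx)
    return num / den

# Dataset sizes spaced evenly on a log axis.
sizes = [10 ** k for k in range(2, 7)]
losses = [power_law_loss(D) for D in sizes]

# For an exact power law the fitted slope equals -alpha.
slope = loglog_slope(sizes, losses)
print(f"fitted log-log slope: {slope:.3f}")
```

A power law appears as a straight line on log-log axes, so the fitted slope gives the decay exponent directly.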

Abstract

This paper explores the intricate behavior of deep neural networks (DNNs) through the lens of neuron activation dynamics. We propose a probabilistic framework that analyzes models' neuron activation patterns as a stochastic process, uncovering theoretical insights into neural scaling laws, such as over-parameterization and the power-law decay of loss with respect to dataset size. By deriving key mathematical relationships, we show that the number of activated neurons increases in the form of $N (1 - (\frac{b N}{D + b N})^{b})$, and that neuron activation should follow a power-law distribution. Based on these two mathematical results, we demonstrate how DNNs maintain generalization capabilities even under over-parameterization, and we elucidate the phase transition phenomenon observed in loss curves when dataset size is plotted on a log axis (i.e., the data magnitude increases linearly). Moreover, by…
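
To get a feel for the activated-neuron expression in the abstract, the following minimal sketch evaluates it numerically. It assumes that $N$ is the total number of neurons, $D$ is the dataset size, and $b$ is a positive constant; the excerpt does not define these symbols, so that reading, the function name, and the sample values are all assumptions.

```python
def activated_neurons(N: float, D: float, b: float) -> float:
    # Evaluates N * (1 - (b*N / (D + b*N))**b) from the abstract.
    # Assumed meaning of symbols (not stated in the excerpt):
    # N = number of neurons, D = dataset size, b = positive constant.
    ratio = (b * N) / (D + b * N)
    return N * (1.0 - ratio ** b)

# Illustrative values only; they are not taken from the paper.
N, b = 1000.0, 0.5
dataset_sizes = [0.0, 10.0, 100.0, 1_000.0, 10_000.0, 100_000.0]
counts = [activated_neurons(N, D, b) for D in dataset_sizes]

for D, count in zip(dataset_sizes, counts):
    print(f"D = {D:>9.0f}  activated = {count:8.2f}")
```

Under these assumptions the expression equals 0 at $D = 0$, increases with $D$, and approaches $N$ from below as $D$ grows.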

Taxonomy

Topics: Neural Networks and Applications