Learning Beyond the Gaussian Data: Learning Dynamics of Neural Networks on an Expressive and Cumulant-Controllable Data Model
Onat Ure, Samet Demir, Zafer Dogan

TL;DR
This paper studies how high-order statistics of the data shape neural network learning dynamics. Using a non-Gaussian data model with controllable cumulants, it shows that networks learn statistical moments progressively, from low order to high order.
Contribution
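It introduces a data model built as a generative two-layer neural network whose activation function is expanded in Hermite polynomials. The Hermite coefficients control the cumulants of the generated data, which enables analysis of how high-order statistics affect neural network training.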
Findings
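Networks first learn low-order statistics (mean and covariance), then progressively learn high-order cumulants.
The controllable data model gives interpretable control over cumulants such as skewness and kurtosis, allowing a systematic study of distributional effects.
Pretraining the generative model on real data (Fashion-MNIST) validates the model's practical relevance.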
Abstract
We study the effect of high-order statistics of data on the learning dynamics of neural networks (NNs) by using a moment-controllable non-Gaussian data model. Considering the expressivity of two-layer neural networks, we first construct the data model as a generative two-layer NN where the activation function is expanded by using Hermite polynomials. This allows us to achieve interpretable control over high-order cumulants such as skewness and kurtosis through the Hermite coefficients while keeping the data model realistic. Using samples generated from the data model, we perform controlled online learning experiments with a two-layer NN. Our results reveal a moment-wise progression in training: networks first capture low-order statistics such as mean and covariance, and progressively learn high-order cumulants. Finally, we pretrain the generative model on the Fashion-MNIST dataset and…
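The link between Hermite coefficients and cumulants described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: the function names (`sample_data`, `skewness`, `excess_kurtosis`), the random unit-norm weight matrix `W`, and the choice of probabilists' Hermite polynomials are all assumptions made for illustration. It draws a Gaussian latent vector, projects it through one layer, and applies an activation written as a Hermite series, so that changing the coefficients changes the skewness and kurtosis of the output.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval


def sample_data(n, d_latent, d_out, coeffs, seed=0):
    """Sample x = sigma(W z), with sigma(u) = sum_k coeffs[k] * He_k(u).

    Hypothetical simplification of a generative two-layer model:
    z is standard Gaussian and the rows of W have unit norm, so each
    pre-activation u_i is itself N(0, 1).
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((d_out, d_latent))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    z = rng.standard_normal((n, d_latent))
    u = z @ W.T
    # hermeval evaluates the probabilists' Hermite series elementwise.
    return hermeval(u, coeffs)


def skewness(x):
    """Per-coordinate standardized third central moment."""
    xc = x - x.mean(axis=0)
    return (xc ** 3).mean(axis=0) / xc.std(axis=0) ** 3


def excess_kurtosis(x):
    """Per-coordinate standardized fourth central moment minus 3."""
    xc = x - x.mean(axis=0)
    return (xc ** 4).mean(axis=0) / xc.std(axis=0) ** 4 - 3.0


# Only the He_1 term: the activation is linear, so the data stay Gaussian.
gauss = sample_data(200_000, 8, 4, [0.0, 1.0])
# Adding a He_2 term (u^2 - 1) makes every coordinate positively skewed.
skewed = sample_data(200_000, 8, 4, [0.0, 1.0, 0.3])

print("linear activation  skew:", np.round(skewness(gauss), 3))
print("with He_2 term     skew:", np.round(skewness(skewed), 3))
print("with He_2 term     excess kurtosis:", np.round(excess_kurtosis(skewed), 3))
```

In this sketch the He_2 coefficient acts as a skewness knob, while higher-order coefficients would mainly affect kurtosis and beyond. How the paper parameterizes the second layer and normalizes the coefficients is not stated in the abstract, so those details are left out here.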
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Quantum many-body systems
