Dissecting a Small Artificial Neural Network
Xiguang Yang, Krish Arora, Michael Bachmann

TL;DR
This paper analyzes the loss landscape and convergence behavior of the simplest neural network modeling XOR, revealing how landscape features influence learning dynamics and how adding neurons simplifies the landscape.
Contribution
It introduces a detailed analysis of the loss landscape for a minimal XOR neural network and connects the learning process to phase transitions in statistical physics.
Findings
Loss landscape features explain efficient convergence of backpropagation.
Adding hidden neurons simplifies the loss landscape, reducing entropic barriers.
The network's learning process resembles an annealing phase transition.
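To make the first finding concrete, here is a minimal sketch, not the authors' code. It assumes a 2-2-1 architecture (consistent with the nine parameters mentioned in the abstract), sigmoid activations, a mean-squared-error loss, and full-batch gradient descent; the seed, learning rate, step count, and all function names are illustrative choices.

```python
import math
import random

# XOR truth table: (inputs, target).
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(p, x):
    """Assumed 2-2-1 network. p = [w11, w12, b1, w21, w22, b2, v1, v2, c]."""
    h1 = sigmoid(p[0] * x[0] + p[1] * x[1] + p[2])
    h2 = sigmoid(p[3] * x[0] + p[4] * x[1] + p[5])
    y = sigmoid(p[6] * h1 + p[7] * h2 + p[8])
    return h1, h2, y

def loss(p):
    """Mean squared error over the four XOR patterns (assumed loss)."""
    return sum((forward(p, x)[2] - t) ** 2 for x, t in DATA) / len(DATA)

def gradient(p):
    """Backpropagation: gradient of loss(p) over the full batch."""
    g = [0.0] * 9
    for x, t in DATA:
        h1, h2, y = forward(p, x)
        dy = 2.0 * (y - t) * y * (1.0 - y) / len(DATA)  # dL/d(output pre-activation)
        d1 = dy * p[6] * h1 * (1.0 - h1)                # dL/d(hidden 1 pre-activation)
        d2 = dy * p[7] * h2 * (1.0 - h2)                # dL/d(hidden 2 pre-activation)
        g[0] += d1 * x[0]; g[1] += d1 * x[1]; g[2] += d1
        g[3] += d2 * x[0]; g[4] += d2 * x[1]; g[5] += d2
        g[6] += dy * h1;   g[7] += dy * h2;   g[8] += dy
    return g

def train(p, lr=0.5, steps=2000):
    """Plain gradient descent; lr and steps are arbitrary illustrative values."""
    for _ in range(steps):
        g = gradient(p)
        p = [pi - lr * gi for pi, gi in zip(p, g)]
    return p

rng = random.Random(0)                       # fixed seed for repeatability
p0 = [rng.uniform(-1.0, 1.0) for _ in range(9)]
p_final = train(p0)
print("initial loss:", loss(p0))
print("final loss:  ", loss(p_final))
```

Whether a given run reaches near-zero loss depends on the initialization; comparing `p_final` across seeds is one way to observe parameters that differ while the loss is similar.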
Abstract
We investigate the loss landscape and backpropagation dynamics of convergence for the simplest possible artificial neural network representing the logical exclusive-OR (XOR) gate. Cross-sections of the loss landscape in the nine-dimensional parameter space are found to exhibit distinct features, which help explain why backpropagation efficiently achieves convergence toward zero loss, whereas values of weights and biases keep drifting. Differences in shapes of cross-sections obtained by nonrandomized and randomized batches are discussed. In reference to statistical physics, we introduce the microcanonical entropy as a unique quantity that allows us to characterize the phase behavior of the network. Learning in neural networks can thus be thought of as an annealing process that experiences the analogue of phase transitions known from thermodynamic systems. It also reveals how the loss…
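The idea of a loss-landscape cross-section can be illustrated with a short sketch, which is not taken from the paper: hold eight of the nine parameters fixed and scan the ninth. It assumes a 2-2-1 sigmoid network with mean-squared-error loss, and uses a hand-picked parameter vector `SOLUTION` (hidden units acting roughly as OR and AND) as the reference point; the names `cross_section` and `SOLUTION` and the scan range are hypothetical.

```python
import math

# XOR truth table: (inputs, target).
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(p):
    """MSE of an assumed 2-2-1 net, p = [w11, w12, b1, w21, w22, b2, v1, v2, c]."""
    total = 0.0
    for x, t in DATA:
        h1 = sigmoid(p[0] * x[0] + p[1] * x[1] + p[2])
        h2 = sigmoid(p[3] * x[0] + p[4] * x[1] + p[5])
        y = sigmoid(p[6] * h1 + p[7] * h2 + p[8])
        total += (y - t) ** 2
    return total / len(DATA)

# Hand-picked point (an assumption, not a trained result):
# h1 ~ OR(x1, x2), h2 ~ AND(x1, x2), output ~ h1 AND NOT h2.
SOLUTION = [20.0, 20.0, -10.0, 20.0, 20.0, -30.0, 20.0, -20.0, -10.0]

def cross_section(p, index, values):
    """Loss along one coordinate axis, all other parameters held at p."""
    out = []
    for v in values:
        q = list(p)
        q[index] = v
        out.append(loss(q))
    return out

# Scan the output bias c (index 8) over an arbitrary range.
c_values = [-30.0 + i for i in range(61)]
profile = cross_section(SOLUTION, 8, c_values)
best_c = c_values[profile.index(min(profile))]

for c, l in zip(c_values[::5], profile[::5]):
    print(f"c = {c:6.1f}   loss = {l:.6f}")
```

In this particular slice the loss stays close to zero over a range of `c` around the reference value and rises toward 0.5 at both ends of the scan. A two-dimensional cross-section follows the same pattern with two indices varied on a grid.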
