Dissecting a Small Artificial Neural Network

Xiguang Yang; Krish Arora; Michael Bachmann

arXiv:2501.08341·cond-mat.dis-nn·January 16, 2025

TL;DR

This paper analyzes the loss landscape and convergence behavior of the simplest neural network modeling XOR, revealing how landscape features influence learning dynamics and how adding neurons simplifies the landscape.

Contribution

It introduces a detailed analysis of the loss landscape for a minimal XOR neural network and connects the learning process to phase transitions in statistical physics.

Findings

01. Loss landscape features explain efficient convergence of backpropagation.

02. Adding hidden neurons simplifies the loss landscape, reducing entropic barriers.

03. The network's learning process resembles an annealing phase transition.

Abstract

We investigate the loss landscape and the convergence dynamics of backpropagation for the simplest possible artificial neural network representing the logical exclusive-OR (XOR) gate. Cross-sections of the loss landscape in the nine-dimensional parameter space are found to exhibit distinct features, which help explain why backpropagation efficiently achieves convergence toward zero loss, whereas the values of weights and biases keep drifting. Differences in the shapes of cross-sections obtained with nonrandomized and randomized batches are discussed. In reference to statistical physics, we introduce the microcanonical entropy as a unique quantity that makes it possible to characterize the phase behavior of the network. Learning in neural networks can thus be thought of as an annealing process that experiences the analogue of phase transitions known from thermodynamic systems. It also reveals how the loss…
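To make the nine-dimensional parameter space concrete, here is a minimal sketch of an XOR network and a one-dimensional loss cross-section. It rests on assumptions not stated in the abstract: a 2-2-1 architecture (four hidden weights, two hidden biases, two output weights, one output bias, which gives nine parameters), sigmoid activations, a mean-squared-error loss over the full batch of four patterns, and plain gradient descent. The function names, parameter ordering, learning rate, and step count are illustrative choices, not the authors' setup.

```python
import math
import random

# XOR truth table: (inputs, target).
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
        ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def forward(p, x):
    """Assumed layout: p = [w11, w12, b1, w21, w22, b2, v1, v2, c]."""
    h1 = sigmoid(p[0] * x[0] + p[1] * x[1] + p[2])
    h2 = sigmoid(p[3] * x[0] + p[4] * x[1] + p[5])
    y = sigmoid(p[6] * h1 + p[7] * h2 + p[8])
    return h1, h2, y

def loss(p):
    """Mean squared error over the four XOR patterns (assumed loss)."""
    return sum((forward(p, x)[2] - t) ** 2 for x, t in DATA) / len(DATA)

def gradient(p):
    """Backpropagation: gradient of the loss with respect to all nine parameters."""
    g = [0.0] * 9
    for x, t in DATA:
        h1, h2, y = forward(p, x)
        dy = 2.0 * (y - t) * y * (1.0 - y) / len(DATA)  # output pre-activation
        d1 = dy * p[6] * h1 * (1.0 - h1)                # hidden neuron 1
        d2 = dy * p[7] * h2 * (1.0 - h2)                # hidden neuron 2
        g[0] += d1 * x[0]
        g[1] += d1 * x[1]
        g[2] += d1
        g[3] += d2 * x[0]
        g[4] += d2 * x[1]
        g[5] += d2
        g[6] += dy * h1
        g[7] += dy * h2
        g[8] += dy
    return g

def train(p, lr=0.5, steps=5000):
    """Full-batch gradient descent; lr and steps are illustrative values."""
    for _ in range(steps):
        g = gradient(p)
        p = [pi - lr * gi for pi, gi in zip(p, g)]
    return p

def cross_section(p, index, values):
    """Loss along one parameter axis with the other eight held fixed."""
    out = []
    for v in values:
        q = list(p)
        q[index] = v
        out.append(loss(q))
    return out

if __name__ == "__main__":
    random.seed(0)
    start = [random.uniform(-1.0, 1.0) for _ in range(9)]
    trained = train(start)
    print("loss before:", loss(start), "after:", loss(trained))
    grid = [-10.0 + 0.5 * i for i in range(41)]
    print("cross-section along v1:", cross_section(trained, 6, grid)[:5])
```

Plotting `cross_section` for each of the nine indices gives the kind of one-dimensional slice through the landscape that the abstract refers to. With only two hidden neurons, gradient descent from a random start is not guaranteed to reach zero loss, so the printed values depend on the initialization.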
