Dynamics of Learning with Restricted Training Sets I: General Theory

A.C.C. Coolen; D. Saad

arXiv:cond-mat/9909402·cond-mat.dis-nn·October 31, 2009

TL;DR

This paper uses dynamical replica theory to analyze the learning dynamics of single-layer neural networks trained on restricted training sets, where the training set size $p$ is proportional to the number of inputs $N$. In this regime the dynamics is of a spin-glass nature, and the theory predicts the evolution of the training and generalization errors.

Contribution

It applies dynamical replica theory to supervised learning with restricted training sets, treating the composition of the training set as quenched disorder.

Findings

01 Predicts the evolution of the training and generalization errors

02 Shows the spin-glass nature of learning dynamics with restricted training sets

03 Recovers the earlier formalism for complete training sets as the special case $\alpha = p / N \to \infty$

Abstract

We study the dynamics of supervised learning in layered neural networks, in the regime where the size $p$ of the training set is proportional to the number $N$ of inputs. Here the local fields are no longer described by Gaussian probability distributions and the learning dynamics is of a spin-glass nature, with the composition of the training set playing the role of quenched disorder. We show how dynamical replica theory can be used to predict the evolution of macroscopic observables, including the two relevant performance measures (training error and generalization error), incorporating the old formalism developed for complete training sets in the limit $\alpha = p / N \to \infty$ as a special case. For simplicity we restrict ourselves in this paper to single-layer networks and realizable tasks.
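
To make the setting concrete, the following is a minimal numerical sketch of the scenario the abstract describes, not the paper's method: a single-layer student learns a realizable task (labels from a teacher perceptron) from a fixed training set of $p = \alpha N$ examples that is resampled during learning. The Hebbian update rule, Gaussian inputs, the function name `simulate`, and all parameter names and values are assumptions chosen for illustration. The generalization error is computed from the student-teacher overlap using the standard perceptron formula for isotropic inputs.

```python
import numpy as np

def simulate(N=200, alpha=1.0, steps_per_N=10, eta=1.0, seed=0):
    """Hypothetical sketch: on-line Hebbian learning from a restricted training set."""
    rng = np.random.default_rng(seed)
    p = int(alpha * N)                      # training set size p = alpha * N

    # Teacher weights B define the realizable task (assumed: unit-norm Gaussian).
    B = rng.standard_normal(N)
    B /= np.linalg.norm(B)

    # Fixed (quenched) training set: inputs X and teacher labels y.
    X = rng.standard_normal((p, N))
    y = np.sign(X @ B)

    # Student weights J, trained by resampling the same p examples.
    J = np.zeros(N)
    for _ in range(steps_per_N * N):
        mu = rng.integers(p)                # pick a training example at random
        J += (eta / N) * y[mu] * X[mu]      # Hebbian rule (an assumption here)

    # Training error: fraction of training examples the student misclassifies.
    E_t = float(np.mean(np.sign(X @ J) != y))

    # Generalization error for isotropic inputs: angle between J and B over pi.
    cos_angle = (J @ B) / np.linalg.norm(J)
    E_g = float(np.arccos(np.clip(cos_angle, -1.0, 1.0)) / np.pi)
    return E_t, E_g

if __name__ == "__main__":
    for alpha in (0.5, 1.0, 4.0):
        E_t, E_g = simulate(alpha=alpha)
        print(f"alpha={alpha}: E_t={E_t:.3f}, E_g={E_g:.3f}")
```

Because examples are reused, the student's weights become correlated with the specific training set, so the training error and the generalization error differ at finite $\alpha$. The gap between them is expected to narrow as $\alpha$ grows, which is consistent with the complete-training-set limit mentioned in the abstract.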
