Embedding generalization within the learning dynamics: An approach based-on sample path large deviation theory
Getachew K. Befekadu

TL;DR
This paper introduces a novel approach to enhance generalization in learning models by analyzing the learning dynamics through sample path large deviation theory, linking stochastic gradient processes to optimal control and rare event probabilities.
Contribution
It develops a theoretical framework that uses the Freidlin-Wentzell theory of large deviations to estimate the probability that the randomly perturbed gradient dynamics hit a target set specified by the testing dataset loss landscape, connecting learning dynamics with optimal control and robustness.
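For orientation, the standard Freidlin-Wentzell setting behind this kind of estimate can be sketched as follows. The notation is assumed here for illustration and is not taken from the paper: $\theta^{\epsilon}_t$ is the parameter path, $J$ the training loss, $W_t$ a standard Wiener process, $\Gamma$ the target set defined by the testing loss, and $T$ a fixed time horizon.

\[
d\theta^{\epsilon}_t = -\nabla J(\theta^{\epsilon}_t)\,dt + \sqrt{\epsilon}\,dW_t, \qquad \theta^{\epsilon}_0 = \theta_0,
\]
\[
I_T(\varphi) = \frac{1}{2}\int_0^T \bigl|\dot{\varphi}(t) + \nabla J(\varphi(t))\bigr|^2\,dt,
\]
\[
\lim_{\epsilon \to 0}\,\epsilon \log \mathbb{P}\bigl(\theta^{\epsilon}_t \in \Gamma \text{ for some } t \in [0,T]\bigr)
= -\inf\bigl\{ I_T(\varphi) : \varphi(0) = \theta_0,\ \varphi(t) \in \Gamma \text{ for some } t \in [0,T] \bigr\},
\]

where the limit holds under suitable regularity conditions on $\Gamma$. Writing $\dot{\varphi} = -\nabla J(\varphi) + u$ turns the infimum into a minimum-energy control problem with cost $\frac{1}{2}\int_0^T |u(t)|^2\,dt$, which is the usual route from large deviations to optimal control.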
Findings
Provides asymptotic probability estimates for rare events in learning dynamics.
Establishes a connection between large deviations and optimal control in model training.
Offers a computational algorithm for optimal point estimation in nonlinear regression.
Abstract
We consider a typical learning problem of point estimations for modeling of nonlinear functions or dynamical systems in which generalization, i.e., verifying a given learned model, can be embedded as an integral part of the learning process or dynamics. In particular, we consider an empirical risk minimization based learning problem that exploits gradient methods from a continuous-time perspective with small random perturbations, which is guided by the training dataset loss. Here, we provide an asymptotic probability estimate in the small noise limit, based on the Freidlin-Wentzell theory of large deviations, for the event that the sample path of the random process corresponding to the randomly perturbed gradient dynamical system hits a certain target set, i.e., a rare event, where the target set is specified by the testing dataset loss landscape. Interestingly, the proposed framework can be viewed as one…
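The setting described in the abstract can be illustrated with a small Monte Carlo simulation. This is a minimal sketch under stated assumptions, not the paper's algorithm: the two-parameter model `theta[0] * tanh(theta[1] * x)`, the squared-error loss, the Euler-Maruyama discretization, and every function name and default value below are hypothetical choices made for illustration. The sketch perturbs gradient descent on a training loss with noise of size `sqrt(eps)` and counts how often a path enters the set where the testing loss falls below a chosen `level`.

```python
import numpy as np

def model(theta, x):
    # Hypothetical regression model: theta has shape (n_paths, 2), x has shape (n_samples,).
    return theta[:, :1] * np.tanh(theta[:, 1:2] * x[None, :])

def loss(theta, x, y):
    # Half mean squared error, evaluated separately for each path.
    r = model(theta, x) - y[None, :]
    return 0.5 * np.mean(r ** 2, axis=1)

def grad_loss(theta, x, y):
    # Analytic gradient of `loss` with respect to theta, shape (n_paths, 2).
    t = np.tanh(theta[:, 1:2] * x[None, :])
    r = theta[:, :1] * t - y[None, :]
    g0 = np.mean(r * t, axis=1)
    g1 = np.mean(r * theta[:, :1] * (1.0 - t ** 2) * x[None, :], axis=1)
    return np.stack([g0, g1], axis=1)

def hitting_frequency(eps, theta0, train, test, level,
                      T=5.0, dt=1e-2, n_paths=500, seed=0):
    # Euler-Maruyama simulation of d(theta) = -grad J_train(theta) dt + sqrt(eps) dW.
    # Returns the fraction of paths whose testing loss drops to `level` or below by time T.
    rng = np.random.default_rng(seed)
    x_tr, y_tr = train
    x_te, y_te = test
    theta = np.tile(np.asarray(theta0, dtype=float), (n_paths, 1))
    hit = loss(theta, x_te, y_te) <= level
    for _ in range(int(round(T / dt))):
        noise = rng.standard_normal(theta.shape)
        theta = theta - grad_loss(theta, x_tr, y_tr) * dt + np.sqrt(eps * dt) * noise
        hit |= loss(theta, x_te, y_te) <= level
    return float(hit.mean())

if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    # Synthetic training and testing sets drawn from the same hypothetical ground truth.
    x_tr = data_rng.uniform(-2.0, 2.0, 50)
    y_tr = 1.5 * np.tanh(0.8 * x_tr) + 0.1 * data_rng.standard_normal(50)
    x_te = data_rng.uniform(-2.0, 2.0, 50)
    y_te = 1.5 * np.tanh(0.8 * x_te) + 0.1 * data_rng.standard_normal(50)
    for eps in (0.2, 0.1, 0.05):
        p = hitting_frequency(eps, [0.2, 0.2], (x_tr, y_tr), (x_te, y_te), level=0.01)
        print(f"eps={eps}: estimated hitting frequency {p:.3f}")
```

Plain Monte Carlo becomes unreliable once the event is genuinely rare, which is the regime where an asymptotic large deviation estimate is most useful; the simulation is meant only to make the objects in the abstract concrete.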
Taxonomy
Topics: Neural Networks and Applications
