Machine Learning as Statistical Data Assimilation
H. D. I. Abarbanel, P. J. Rozdeba, S. Shirman

TL;DR
This paper identifies a strong equivalence between neural network training and statistical data assimilation, with the layer label playing the role of time, and uses it to gain insight into optimization, network design, and the theory of deep learning through variational methods.
Contribution
It introduces a formal equivalence between ML and data assimilation, proposing a variational annealing approach for optimal network design and analyzing continuous-layer models as boundary value problems.
Findings
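Variational annealing, a method borrowed from DA, is used to search for the global minimum of the ML cost function.
Treating the layer label as continuous turns the layer-to-layer map into a differential equation, posed as a boundary value problem.
Backpropagation is interpreted via Hamiltonian mechanics.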
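The continuous-layer finding can be illustrated with a short worked limit. This is only a sketch under an assumed residual form for the layer map; the symbols F, \Delta, y_in and y_out are notation introduced here for illustration and are not taken from the paper.

```latex
% Discrete layers with spacing \Delta (assumed residual-style update)
x(l + \Delta) = x(l) + \Delta \, F\bigl(x(l), w(l)\bigr)

% Continuous-layer limit \Delta \to 0: an ordinary differential equation in l
\frac{dx(l)}{dl} = F\bigl(x(l), w(l)\bigr), \qquad 0 \le l \le L

% Data constrain the state at both ends of the layer interval
x(0) \approx y_{\mathrm{in}}, \qquad x(L) \approx y_{\mathrm{out}}
```

Because information enters at both l = 0 and l = L, the problem is a two-point boundary value problem rather than an initial value problem, which is consistent with the layer label acting as the analog of time.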
Abstract
We identify a strong equivalence between neural network based machine learning (ML) methods and the formulation of statistical data assimilation (DA), known to be a problem in statistical physics. DA, as used widely in physical and biological sciences, systematically transfers information in observations to a model of the processes producing the observations. The correspondence is that layer label in the ML setting is the analog of time in the data assimilation setting. Utilizing aspects of this equivalence we discuss how to establish the global minimum of the cost functions in the ML context, using a variational annealing method from DA. This provides a design method for optimal networks for ML applications and may serve as the basis for understanding the success of "deep learning". Results from an ML example are presented. When the layer label is taken to be continuous, the…
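The variational annealing idea mentioned in the abstract can be sketched on a toy problem. The sketch below assumes a scalar state per layer, a tanh layer map, a single input/output pair, and a geometric schedule Rf = Rf0 * alpha**beta for the model-error weight. All of these choices, and every function and parameter name, are assumptions made for illustration, not the authors' implementation. The cost is a measurement-error term at the first and last layers plus a model-error term linking adjacent layers; the model-error weight starts small and is increased step by step, with each minimization warm-started from the previous result.

```python
import numpy as np

def action(x, w, y_in, y_out, Rm, Rf):
    """Measurement error at the two end layers plus model error between layers."""
    meas = 0.5 * Rm * ((x[0] - y_in) ** 2 + (x[-1] - y_out) ** 2)
    resid = x[1:] - np.tanh(w * x[:-1])      # x(l+1) - f(x(l), w(l))
    return meas + 0.5 * Rf * np.sum(resid ** 2)

def gradient(x, w, y_in, y_out, Rm, Rf):
    """Analytic gradient of the action with respect to states x and weights w."""
    t = np.tanh(w * x[:-1])
    resid = x[1:] - t
    dt = 1.0 - t ** 2                        # derivative of tanh
    gx = np.zeros_like(x)
    gx[1:] += Rf * resid                     # contribution via x(l+1)
    gx[:-1] -= Rf * resid * dt * w           # contribution via x(l)
    gx[0] += Rm * (x[0] - y_in)
    gx[-1] += Rm * (x[-1] - y_out)
    gw = -Rf * resid * dt * x[:-1]
    return gx, gw

def minimize(x, w, y_in, y_out, Rm, Rf, iters=500):
    """Gradient descent with step halving; only steps that lower the action are accepted."""
    a = action(x, w, y_in, y_out, Rm, Rf)
    step = 1.0
    for _ in range(iters):
        gx, gw = gradient(x, w, y_in, y_out, Rm, Rf)
        while step > 1e-12:
            x_new, w_new = x - step * gx, w - step * gw
            a_new = action(x_new, w_new, y_in, y_out, Rm, Rf)
            if a_new < a:
                x, w, a = x_new, w_new, a_new
                step *= 2.0                  # try a longer step next time
                break
            step *= 0.5
        else:
            break                            # no improving step was found
    return x, w, a

def variational_annealing(y_in, y_out, n_layers=5, Rm=1.0,
                          Rf0=1e-3, alpha=2.0, n_beta=20, seed=0):
    """Raise the model-error weight Rf geometrically, warm-starting each minimization."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, n_layers + 1)  # states at layers 0..L
    w = rng.uniform(-1.0, 1.0, n_layers)      # one weight per layer map
    history = []
    for beta in range(n_beta + 1):
        Rf = Rf0 * alpha ** beta
        a_start = action(x, w, y_in, y_out, Rm, Rf)
        x, w, a_end = minimize(x, w, y_in, y_out, Rm, Rf)
        history.append((Rf, a_start, a_end))
    return x, w, history

if __name__ == "__main__":
    x, w, history = variational_annealing(y_in=0.5, y_out=-0.3)
    for Rf, a_start, a_end in history:
        print(f"Rf={Rf:10.4f}  action before={a_start:.3e}  after={a_end:.3e}")
```

At small Rf the measurement term dominates and the end-layer states are pulled toward the data; as Rf grows, the layer map is enforced more strictly. The plain gradient descent used here is a stand-in for whatever optimizer one prefers, and on larger problems a quasi-Newton method would likely be a better choice.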
Taxonomy
Topics: Model Reduction and Neural Networks · Meteorological Phenomena and Simulations · Computational Physics and Python Applications
