Statistical mechanics of continual learning: variational principle and mean-field potential
Chan Li, Zhenye Huang, Wenxuan Zou, Haiping Huang

TL;DR
This paper develops a unified theoretical framework for continual learning in neural networks using variational Bayesian methods and thermodynamic potentials, providing analytical insights and improved algorithms for multi-task learning.
Contribution
It introduces a variational Bayesian approach in field space, translating continual learning into a Franz-Parisi potential framework, and derives a new learning algorithm considering weight uncertainty.
Findings
Predictions match numerical experiments with stochastic gradient descent.
The new algorithm outperforms existing metaplasticity methods.
The framework connects to elastic weight consolidation and neuroscience-inspired learning.
Abstract
Continual learning of multiple tasks of different nature remains an obstacle to artificial general intelligence. Recently, various heuristic tricks, from both machine learning and neuroscience angles, have been proposed, but they lack a unified theoretical grounding. Here, we focus on continual learning in single-layered and multi-layered neural networks with binary weights. We propose a variational Bayesian learning setting in which the networks are trained in a field space rather than in the discrete-weight space, where gradients are ill-defined. Weight uncertainty is naturally incorporated and modulates synaptic resources among tasks. From a physics perspective, we translate variational continual learning into the Franz-Parisi thermodynamic potential framework, where previous task knowledge acts as both a prior and a reference. We thus interpret the continual learning of the…
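The abstract excerpt does not spell out the parameterization, so the following is only a minimal sketch of what training "in a field space" with weight uncertainty could look like. It assumes each binary weight w in {-1, +1} has an independent variational distribution q(w) proportional to exp(beta * w * theta), controlled by a real-valued field theta. The function names, the inverse temperature `beta`, and the factorized form are illustrative assumptions, not taken from the paper.

```python
import math

def magnetization(theta, beta=1.0):
    """Mean of a binary weight w in {-1, +1} under q(w) ∝ exp(beta * w * theta)."""
    return math.tanh(beta * theta)

def weight_variance(theta, beta=1.0):
    """Variance of the same weight: 1 - m^2, largest when the field is zero."""
    return 1.0 - magnetization(theta, beta) ** 2

def log_cosh(x):
    """Numerically stable ln cosh(x) = |x| + ln(1 + exp(-2|x|)) - ln 2."""
    a = abs(x)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)

def kl_to_previous(theta, theta_prev, beta=1.0):
    """KL(q_theta || q_theta_prev), summed over independent binary weights.

    Under the assumed factorized form, each weight contributes
    beta * (theta - theta_prev) * tanh(beta * theta)
    - ln cosh(beta * theta) + ln cosh(beta * theta_prev).
    """
    total = 0.0
    for t, tp in zip(theta, theta_prev):
        total += (beta * (t - tp) * math.tanh(beta * t)
                  - log_cosh(beta * t) + log_cosh(beta * tp))
    return total

# Hypothetical fields after a previous task and after a new task.
prev_fields = [2.0, -0.1, 0.0]
new_fields = [1.8, 0.9, -0.5]
penalty = kl_to_previous(new_fields, prev_fields)
```

In a sketch like this, a penalty of the `kl_to_previous` kind would be added to the new task's loss so that the previous task's distribution acts as a prior. The fields are real-valued, so gradients are well defined even though the weights themselves are discrete, and `weight_variance` gives a per-weight measure of uncertainty.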
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
