On Sequential Bayesian Inference for Continual Learning
Samuel Kessler, Adam Cobb, Tim G. J. Rudner, Stefan Zohren, Stephen J. Roberts

TL;DR
This paper critically examines the effectiveness of sequential Bayesian inference in continual learning, revealing its limitations due to model misspecification and data imbalance, and proposes a new baseline method for improved performance.
Contribution
The paper demonstrates the failure of sequential Bayesian inference to prevent catastrophic forgetting and introduces Prototypical Bayesian Continual Learning as a competitive alternative.
Findings
Sequential Bayesian inference fails to prevent forgetting in neural networks.
Model misspecification can lead to sub-optimal continual learning performance.
Task data imbalance contributes to forgetting.
Abstract
Sequential Bayesian inference can be used for continual learning to prevent catastrophic forgetting of past tasks and provide an informative prior when learning new tasks. We revisit sequential Bayesian inference and test whether having access to the true posterior is guaranteed to prevent catastrophic forgetting in Bayesian neural networks. To do this, we perform sequential Bayesian inference using Hamiltonian Monte Carlo. We propagate the posterior as a prior for new tasks by fitting a density estimator on Hamiltonian Monte Carlo samples. We find that this approach fails to prevent catastrophic forgetting, demonstrating the difficulty in performing sequential Bayesian inference in neural networks. From there, we study simple analytical examples of sequential Bayesian inference and continual learning and highlight the issue of model misspecification, which can lead to sub-optimal continual learning…
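To make the idea of "propagating the posterior as a prior" concrete, here is a minimal sketch of sequential Bayesian inference in a conjugate Gaussian model (unknown mean, known noise variance). It is an illustration only, not one of the paper's own examples: the function name `gaussian_update`, the data values, and the prior are all assumptions chosen for clarity. In this well-specified, exactly tractable setting, updating task by task gives the same posterior as seeing all the data at once.

```python
def gaussian_update(prior_mean, prior_var, data, noise_var):
    """Conjugate update for an unknown mean with known noise variance.

    Hypothetical helper for illustration; returns (posterior_mean, posterior_var).
    """
    n = len(data)
    post_precision = 1.0 / prior_var + n / noise_var
    post_mean = (prior_mean / prior_var + sum(data) / noise_var) / post_precision
    return post_mean, 1.0 / post_precision


# Two "tasks" drawn from different regions of the data space (made-up values).
task_1 = [0.9, 1.1, 1.0]
task_2 = [2.1, 1.9]
noise_var = 0.5

# Sequential inference: the posterior after task 1 becomes the prior for task 2.
mean_1, var_1 = gaussian_update(0.0, 10.0, task_1, noise_var)
seq_mean, seq_var = gaussian_update(mean_1, var_1, task_2, noise_var)

# Batch inference: a single update on all data with the original prior.
batch_mean, batch_var = gaussian_update(0.0, 10.0, task_1 + task_2, noise_var)

print(seq_mean, batch_mean)  # the two means agree up to floating-point error
print(seq_var, batch_var)    # so do the variances
```

The equivalence here depends on the posterior being exact and the model being well specified. The abstract's point is that, for neural networks, the propagated posterior is only an approximation (a density estimator fit to Hamiltonian Monte Carlo samples), and that model misspecification can make even exact sequential inference sub-optimal for continual learning.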
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Memory Processes and Influences
Methods: Test
