Online Structured Laplace Approximations For Overcoming Catastrophic Forgetting

Hippolyt Ritter; Aleksandar Botev; David Barber

arXiv:1805.07810 · stat.ML · May 22, 2018 · 99 cites

Open Access

TL;DR

This paper presents a scalable Bayesian online learning method that uses Kronecker factored Laplace approximations to mitigate catastrophic forgetting in neural networks, achieving over 90% test accuracy across a sequence of 50 permuted MNIST tasks.

Contribution

It introduces a novel Kronecker factored online Laplace approximation method that efficiently approximates the posterior in neural networks for continual learning.

Findings

01

Achieves over 90% test accuracy across a sequence of 50 instantiations of the permuted MNIST dataset.

02

Substantially outperforms related methods for overcoming catastrophic forgetting on this benchmark.

03

Scales to modern neural network architectures by using block-diagonal Kronecker factored approximations to the curvature in place of the full Hessian.

Abstract

We introduce the Kronecker factored online Laplace approximation for overcoming catastrophic forgetting in neural networks. The method is grounded in a Bayesian online learning framework, where we recursively approximate the posterior after every task with a Gaussian, leading to a quadratic penalty on changes to the weights. The Laplace approximation requires calculating the Hessian around a mode, which is typically intractable for modern architectures. In order to make our method scalable, we leverage recent block-diagonal Kronecker factored approximations to the curvature. Our algorithm achieves over 90% test accuracy across a sequence of 50 instantiations of the permuted MNIST dataset, substantially outperforming related methods for overcoming catastrophic forgetting.
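The quadratic penalty described in the abstract can be illustrated for a single fully connected layer. The following is a minimal sketch, not the authors' implementation: it assumes the layer's curvature block is approximated as a Kronecker product of two symmetric factors, here called `A` (input side) and `G` (output side), and that `W_star` is the mode found on the previous task. All names, shapes, and the random factors are hypothetical and chosen only for illustration. The sketch shows that the penalty can be evaluated with small matrix products instead of forming the full Kronecker product.

```python
import numpy as np


def kf_quadratic_penalty(W, W_star, A, G):
    """Evaluate 0.5 * vec(D)^T kron(A, G) vec(D) with D = W - W_star,
    without ever building kron(A, G).

    Uses the identity kron(A, G) vec(D) = vec(G D A^T) for column-stacking
    vec, plus the assumption that A is symmetric.
    """
    D = W - W_star                      # shape (n_out, n_in)
    return 0.5 * np.sum(D * (G @ D @ A))


def dense_quadratic_penalty(W, W_star, A, G):
    """Reference version that materialises the full Kronecker product."""
    d = (W - W_star).flatten(order="F")  # column-stacking vec
    return 0.5 * d @ np.kron(A, G) @ d


# Hypothetical toy layer: 4 inputs, 3 outputs.
rng = np.random.default_rng(0)
n_in, n_out = 4, 3

# Random symmetric positive semi-definite factors standing in for the
# curvature estimates (assumption: in practice these come from the network).
X = rng.normal(size=(n_in, n_in))
A = X @ X.T
Y = rng.normal(size=(n_out, n_out))
G = Y @ Y.T

W_star = rng.normal(size=(n_out, n_in))              # mode from the previous task
W = W_star + 0.1 * rng.normal(size=(n_out, n_in))    # current weights

fast = kf_quadratic_penalty(W, W_star, A, G)
dense = dense_quadratic_penalty(W, W_star, A, G)
print(np.isclose(fast, dense))
```

In a continual learning loop, a term of this form would be added to the loss of the new task for each layer, so that weights are pulled back towards `W_star` most strongly along high-curvature directions. The factored form needs storage for an `n_in x n_in` and an `n_out x n_out` matrix per layer rather than one matrix with `(n_in * n_out)^2` entries.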

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Machine Learning and Algorithms