On the Computation of the Fisher Information in Continual Learning

Gido M. van de Ven

arXiv:2502.11756·cs.LG·February 18, 2025

TL;DR

This blog post empirically compares several often-used implementations of the Fisher Information in Elastic Weight Consolidation (EWC) and finds that many reported EWC results could likely be improved by changing how the Fisher Information is computed.

Contribution

An empirical comparison of several often-used ways of computing the Fisher Information in EWC, showing that this choice affects continual learning performance.

Findings

01. Different implementations of the Fisher Information give different EWC results.

02. Many reported EWC results are likely suboptimal because of how the Fisher Information was computed.

03. Changing how the Fisher Information is computed could likely improve those results.

Abstract

One of the most popular methods for continual learning with deep neural networks is Elastic Weight Consolidation (EWC), which involves computing the Fisher Information. The exact way in which the Fisher Information is computed is, however, rarely described, and multiple different implementations of it can be found online. This blog post discusses and empirically compares several often-used implementations, highlighting that many currently reported results for EWC could likely be improved by changing the way the Fisher Information is computed.
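To make concrete why the computation can differ between implementations, here is a minimal sketch for a tiny linear softmax classifier, written in plain Python so that it is self-contained. It is not taken from the post or from its repository. The three modes ("exact", "sample", "empirical") are common choices in the literature and are an assumption about which variants are meant; all function names (`fisher_diag`, `ewc_penalty`, and the helpers) are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, x):
    """Class probabilities of a linear softmax model with weights W[c][d]."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])


def sq_grad(p, x, y):
    """Squared gradient of log p(y|x) with respect to every weight W[c][d]."""
    return [[((1.0 if c == y else 0.0) - p[c]) ** 2 * xi ** 2 for xi in x]
            for c in range(len(p))]


def fisher_diag(W, data, mode="exact", rng=None):
    """Diagonal Fisher estimate; `mode` picks which labels the gradient uses.

    Assumed variants (illustrative, not the post's definitive list):
      exact     - expectation over all classes under the model's prediction
      sample    - one class sampled from the model's prediction per input
      empirical - the ground-truth label of each input
    """
    rng = rng or random.Random(0)
    n_classes, n_inputs = len(W), len(W[0])
    F = [[0.0] * n_inputs for _ in range(n_classes)]
    for x, label in data:
        p = predict(W, x)
        if mode == "exact":
            terms = [(p[y], y) for y in range(n_classes)]
        elif mode == "sample":
            terms = [(1.0, rng.choices(range(n_classes), weights=p)[0])]
        elif mode == "empirical":
            terms = [(1.0, label)]
        else:
            raise ValueError(f"unknown mode: {mode}")
        for weight, y in terms:
            g2 = sq_grad(p, x, y)
            for c in range(n_classes):
                for d in range(n_inputs):
                    F[c][d] += weight * g2[c][d] / len(data)
    return F


def ewc_penalty(W, W_star, F, lam=1.0):
    """EWC quadratic penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    return 0.5 * lam * sum(
        F[c][d] * (W[c][d] - W_star[c][d]) ** 2
        for c in range(len(W)) for d in range(len(W[0]))
    )
```

Calling `fisher_diag` with the same weights and data but different `mode` values generally returns different matrices, so the resulting `ewc_penalty` differs as well. In this sketch, when the model is confident and correct on an input, the "empirical" estimate for that input is smaller than the "exact" one.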

Code & Models

Repositories

GMvandeVen/continual-learning
PyTorch · Official

Models

jaigouk/qwen3-4b-german-teacher-v1
Hugging Face model · 41 downloads

Videos

On the Computation of the Fisher Information in Continual Learning · SlidesLive

Taxonomy

Topics: Machine Learning and ELM · Face and Expression Recognition · Fault Detection and Control Systems

Methods: Elastic Weight Consolidation