On the Computation of the Fisher Information in Continual Learning
Gido M. van de Ven

TL;DR
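This blog post compares several commonly used implementations of the Fisher Information computation in Elastic Weight Consolidation (EWC) for continual learning, and finds that many reported EWC results could likely be improved by changing how the Fisher Information is computed.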
Contribution
It empirically compares several often-used ways of computing the Fisher Information in EWC and shows how the choice affects continual learning performance.
Findings
Different implementations of the Fisher Information computation yield different EWC results.
Many currently reported EWC results are likely suboptimal because of how the Fisher Information was computed.
Changing the way the Fisher Information is computed can improve continual learning performance with EWC.
Abstract
One of the most popular methods for continual learning with deep neural networks is Elastic Weight Consolidation (EWC), which involves computing the Fisher Information. The exact way in which the Fisher Information is computed is, however, rarely described, and multiple different implementations of it can be found online. This blog post discusses and empirically compares several often-used implementations, which highlights that many currently reported results for EWC could likely be improved by changing the way the Fisher Information is computed.
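To make concrete how two implementations of the Fisher Information can differ, the following is a minimal sketch, not the post's own code. It assumes a toy softmax-linear classifier with weights `W`, for which the gradient of log p(y|x) with respect to `W[c][d]` is (1[c = y] - p_c) * x_d. The "exact" variant takes the expectation of the squared gradient over the model's own predicted label distribution, while the "empirical" variant uses only the observed label. The function names (`diag_fisher`, `ewc_penalty`), the `mode` argument, and the `lam` parameter are illustrative assumptions; the variants compared in the post itself may be defined differently.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, x):
    """Class probabilities of a softmax-linear model (hypothetical toy model)."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])


def squared_grad(probs, x, y):
    """Element-wise squared gradient of log p(y|x) with respect to W."""
    return [
        [(((1.0 if c == y else 0.0) - p) * xi) ** 2 for xi in x]
        for c, p in enumerate(probs)
    ]


def diag_fisher(W, data, mode="exact"):
    """Diagonal Fisher estimate, averaged over the (x, y) pairs in `data`.

    mode="exact":     expectation over labels drawn from the model's prediction.
    mode="empirical": uses only the observed label of each sample.
    """
    n_classes, n_features = len(W), len(W[0])
    fisher = [[0.0] * n_features for _ in range(n_classes)]
    for x, y_true in data:
        probs = predict(W, x)
        if mode == "exact":
            weighted_labels = [(p, y) for y, p in enumerate(probs)]
        elif mode == "empirical":
            weighted_labels = [(1.0, y_true)]
        else:
            raise ValueError(f"unknown mode: {mode}")
        for weight, y in weighted_labels:
            sq = squared_grad(probs, x, y)
            for c in range(n_classes):
                for d in range(n_features):
                    fisher[c][d] += weight * sq[c][d] / len(data)
    return fisher


def ewc_penalty(W, W_star, fisher, lam=1.0):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (w_i - w*_i)^2."""
    return 0.5 * lam * sum(
        f * (w - ws) ** 2
        for row, row_star, f_row in zip(W, W_star, fisher)
        for w, ws, f in zip(row, row_star, f_row)
    )


# Toy usage: one sample that the model already classifies fairly confidently.
W_star = [[2.0, 0.0], [0.0, 0.0]]
data = [([1.0, 0.0], 0)]
fisher_exact = diag_fisher(W_star, data, mode="exact")
fisher_empirical = diag_fisher(W_star, data, mode="empirical")
```

On this toy input the two estimates already disagree: for a confidently and correctly classified sample, the empirical variant assigns a much smaller importance to `W[0][0]` than the exact variant does. Since the EWC penalty scales with these values, the effective regularization strength depends on which computation is used.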
Taxonomy
Topics: Machine Learning and ELM · Face and Expression Recognition · Fault Detection and Control Systems
Methods: Elastic Weight Consolidation
