Explaining How Deep Neural Networks Forget by Deep Visualization
Giang Nguyen, Shuan Chen, Tae Joon Jun, Daeyoung Kim

TL;DR
This paper introduces Catastrophic Forgetting Dissector (CFD), a visualization tool that explains catastrophic forgetting in deep neural networks, and uses its observations to derive Critical Freezing, a continual learning method that improves performance and interpretability.
Contribution
The paper presents CFD for visualizing forgetting in neural networks and proposes Critical Freezing, a new continual learning approach inspired by insights from CFD.
Findings
CFD reveals which network components forget during training.
Critical Freezing outperforms recent continual learning techniques.
The approach enhances both performance and explainability.
Abstract
Explaining the behavior of deep neural networks, which are usually treated as black boxes, is critical now that they are being adopted across diverse aspects of human life. Drawing on interpretable machine learning (interpretable ML), this paper proposes a novel tool called Catastrophic Forgetting Dissector (CFD) to explain catastrophic forgetting in continual learning settings. We also introduce a new method called Critical Freezing, based on the observations of our tool. Experiments on ResNet show how catastrophic forgetting happens, in particular which components of this well-known network are forgetting. Our new continual learning algorithm outperforms various recent techniques by a significant margin, demonstrating the value of the investigation. Critical Freezing not only mitigates catastrophic forgetting but also provides explainability.
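The abstract does not spell out the freezing rule, so the following is only a minimal sketch of the general idea, assuming that Critical Freezing amounts to holding fixed the components a CFD-style analysis flags most strongly while the rest of the network trains on the new task. The per-block scores, the block names, and the helpers `select_critical_blocks` and `sgd_step` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def select_critical_blocks(forgetting_scores: Dict[str, float], k: int) -> Set[str]:
    """Return the names of the k blocks with the highest score.

    The scores are a stand-in for whatever a CFD-style analysis reports;
    how the paper actually computes them is not stated in the abstract.
    """
    ranked = sorted(forgetting_scores, key=forgetting_scores.get, reverse=True)
    return set(ranked[:k])


def sgd_step(
    params: Dict[str, List[float]],
    grads: Dict[str, List[float]],
    frozen: Set[str],
    lr: float = 0.5,
) -> Dict[str, List[float]]:
    """Apply one plain SGD update, leaving frozen blocks untouched."""
    updated = {}
    for name, weights in params.items():
        if name in frozen:
            # Frozen block: copy the weights through unchanged.
            updated[name] = list(weights)
        else:
            updated[name] = [w - lr * g for w, g in zip(weights, grads[name])]
    return updated


# Hypothetical scores for a ResNet-like layout (illustrative values only).
scores = {"conv1": 0.05, "block1": 0.10, "block2": 0.40, "block3": 0.15, "block4": 0.65}
frozen = select_critical_blocks(scores, k=2)

# Toy parameters and gradients: two weights per block.
params = {name: [1.0, 1.0] for name in scores}
grads = {name: [0.5, -0.5] for name in scores}
new_params = sgd_step(params, grads, frozen)
```

In a real framework the same effect is usually obtained by excluding the selected parameters from the optimizer or disabling their gradients. Whether the highest-scoring blocks are the right ones to freeze is an assumption of this sketch, not a claim from the paper.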
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: Average Pooling · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization · Max Pooling · Residual Connection
