Multi-Class Unlearning for Image Classification via Weight Filtering
Samuele Poppi, Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara

TL;DR
This paper introduces a novel multi-class unlearning framework for image classification that uses weight filtering and memory matrices, enabling selective class removal and explainability in neural networks.
Contribution
It presents a weight-filtering method that unlearns all classes in a single round, whereas existing approaches target a single class or a limited subset of the data.
Findings
Effective unlearning of all classes in a single round
Applicable to convolutional and Transformer architectures
Provides explainable class representations
Abstract
Machine Unlearning is an emerging paradigm for selectively removing the impact of training datapoints from a network. Unlike existing methods that target a limited subset or a single class, our framework unlearns all classes in a single round. We achieve this by modulating the network's components using memory matrices, enabling the network to demonstrate selective unlearning behavior for any class after training. By discovering weights that are specific to each class, our approach also recovers a representation of the classes which is explainable by design. We test the proposed framework on small- and medium-scale image classification datasets, with both convolution- and Transformer-based backbones, showcasing the potential for explainable solutions through unlearning.
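The idea of modulating a layer with a memory matrix can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the shape of the memory matrices, how they are trained, or at what granularity they act. The function name `filter_weights`, the per-output-channel gating, the sigmoid squashing, and the hand-set memory values are all assumptions made for illustration.

```python
import numpy as np

def filter_weights(weight, memory, unlearn_class):
    """Gate each output channel of a layer with the memory row of one class.

    weight: (out_features, in_features) layer weights.
    memory: (num_classes, out_features) hypothetical memory matrix of logits.
    Returns a filtered copy of `weight`; the original array is not modified.
    """
    # Sigmoid maps the logits to gates in (0, 1); this choice is an assumption.
    gate = 1.0 / (1.0 + np.exp(-memory[unlearn_class]))
    # Broadcast the per-channel gate across the input dimension.
    return weight * gate[:, None]

rng = np.random.default_rng(0)
weight = rng.normal(size=(4, 3))

# Hand-set memory for two classes (illustrative, not learned):
# large positive logits keep a channel, large negative logits suppress it.
memory = np.full((2, 4), 10.0)
memory[1, 2] = -10.0  # pretend output channel 2 is specific to class 1

filtered = filter_weights(weight, memory, unlearn_class=1)
untouched = filter_weights(weight, memory, unlearn_class=0)
```

Under these assumptions, selecting row 1 of the memory matrix suppresses only the channel marked as specific to class 1, while selecting row 0 leaves the layer essentially unchanged. Because one matrix holds a row for every class, the same stored weights can be filtered for any class after training, and the rows themselves indicate which weights each class relies on.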
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Digital Imaging for Blood Diseases · Anomaly Detection Techniques and Applications
Methods: Test
