Machine Unlearning using Forgetting Neural Networks
Amartya Hatua, Trung T. Nguyen, Filip Cano, Andrew H. Sung

TL;DR
This paper introduces an unlearning approach based on Forgetting Neural Networks (FNNs), a neuroscience-inspired architecture that encodes forgetting through multiplicative decay factors, to remove specific data from trained models and support privacy and trust.
Contribution
The paper provides the first concrete implementation of FNNs for targeted unlearning, demonstrates their effectiveness on the MNIST and Fashion-MNIST benchmarks, and reports that membership inference attacks become less effective after unlearning.
Findings
FNNs successfully remove data-specific information from models.
FNN variants preserve performance on retained data.
Membership inference attacks are less effective after unlearning.
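The summary does not say which membership inference attack the authors use. As a rough illustration of how such an evaluation can work, the sketch below implements a simple loss-threshold attack: an example is guessed to be a training member when its loss falls below a threshold. The function names, the threshold, and the attack choice itself are assumptions made for illustration, not the paper's protocol.

```python
def loss_threshold_attack(losses, threshold):
    """Guess 'member' (True) when a per-example loss is below the threshold.

    Assumption: a loss-threshold attack stands in for whatever attack the
    paper actually evaluates; the source does not specify it.
    """
    return [loss < threshold for loss in losses]


def attack_accuracy(member_losses, nonmember_losses, threshold):
    """Fraction of examples whose membership the attack guesses correctly."""
    member_guesses = loss_threshold_attack(member_losses, threshold)
    nonmember_guesses = loss_threshold_attack(nonmember_losses, threshold)
    # A member is correct when guessed True, a non-member when guessed False.
    correct = sum(member_guesses) + sum(not g for g in nonmember_guesses)
    return correct / (len(member_losses) + len(nonmember_losses))
```

Under this sketch, unlearning would count as successful when the attack's accuracy on the forget set moves toward chance (0.5 for balanced sets), because the model's losses on forgotten examples stop looking different from its losses on unseen data.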
Abstract
Modern computer systems store vast amounts of personal data, enabling advances in AI and ML but risking user privacy and trust. For privacy reasons, it is sometimes desired for an ML model to forget part of the data it was trained on. In this paper, we introduce a novel unlearning approach based on Forgetting Neural Networks (FNNs), a neuroscience-inspired architecture that explicitly encodes forgetting through multiplicative decay factors. While FNNs had previously been studied as a theoretical construct, we provide the first concrete implementation and demonstrate their effectiveness for targeted unlearning. We propose several variants with per-neuron forgetting factors, including rank-based assignments guided by activation levels, and evaluate them on MNIST and Fashion-MNIST benchmarks. Our method systematically removes information associated with forget sets while preserving…
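The abstract mentions per-neuron multiplicative decay factors and rank-based assignments guided by activation levels, but gives no formulas. The following is a minimal sketch of one way these ideas could fit together. Everything specific in it is an assumption: the exponential decay form, the rate range, the choice to give the most active neurons the fastest decay, and all function names.

```python
import math


def rank_based_rates(mean_activations, min_rate=0.1, max_rate=1.0):
    """Assign one decay rate per neuron from its activation rank.

    Assumption: neurons that are most active on the forget set receive the
    largest rate (fastest forgetting). The source does not state the direction.
    """
    n = len(mean_activations)
    if n == 1:
        return [max_rate]
    # order[k] is the index of the neuron with the k-th smallest activation.
    order = sorted(range(n), key=lambda i: mean_activations[i])
    rates = [0.0] * n
    for rank, idx in enumerate(order):
        # Spread rates linearly between min_rate and max_rate by rank.
        rates[idx] = min_rate + (max_rate - min_rate) * rank / (n - 1)
    return rates


def forgetting_factors(rates, t):
    """Per-neuron multiplicative factor in (0, 1] at forgetting step t.

    Assumption: exponential decay exp(-rate * t); the paper may use another form.
    """
    return [math.exp(-rate * t) for rate in rates]


def apply_forgetting(activations, factors):
    """Scale each neuron's activation by its forgetting factor."""
    return [a * f for a, f in zip(activations, factors)]
```

At t = 0 every factor equals 1, so the layer is unchanged; as t grows, neurons with larger rates are attenuated sooner, which is the intuition behind ranking neurons by how strongly they respond to the forget set.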
Taxonomy
Topics: Neural Networks and Applications
