POUR: A Provably Optimal Method for Unlearning Representations via Neural Collapse
Anjie Le, Can Peng, Yuyuan Liu, J. Alison Noble

TL;DR
POUR is a novel method that achieves provably optimal unlearning of representations in neural networks by leveraging Neural Collapse theory and geometric projections, ensuring effective forgetting with minimal impact on retained knowledge.
Contribution
The paper introduces POUR, a new geometric projection-based unlearning method with theoretical guarantees, extending unlearning to the representation level and outperforming existing approaches.
Findings
POUR effectively unlearns targeted concepts while preserving knowledge of the retained ones.
It outperforms state-of-the-art methods on CIFAR-10/100 and PathMNIST.
The method provides a provably optimal unlearning operator based on Neural Collapse theory.
Abstract
In computer vision, machine unlearning aims to remove the influence of specific visual concepts or training images without retraining from scratch. Studies show that existing approaches often modify the classifier while leaving internal representations intact, resulting in incomplete forgetting. In this work, we extend the notion of unlearning to the representation level, deriving a three-term interplay between forgetting efficacy, retention fidelity, and class separation. Building on Neural Collapse theory, we show that the orthogonal projection of a simplex Equiangular Tight Frame (ETF) remains an ETF in a lower dimensional space, yielding a provably optimal forgetting operator. We further introduce the Representation Unlearning Score (RUS) to quantify representation-level forgetting and retention fidelity. Building on this, we introduce POUR (Provably Optimal Unlearning of…
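The geometric claim in the abstract, that an orthogonal projection of a simplex ETF is again a simplex ETF in a lower-dimensional space, can be checked numerically. The NumPy sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes the projection removes the direction of the forgotten class's vertex, and the helper names (`simplex_etf`, `project_out`, `pairwise_cosines`) are hypothetical. For K unit-norm ETF vertices the pairwise cosine is -1/(K-1). After projecting out one vertex, the remaining K-1 vectors should have equal norms and pairwise cosine -1/(K-2), which is the simplex ETF value for K-1 classes.

```python
import numpy as np

def simplex_etf(k):
    """Return a (k, k) matrix whose columns are unit-norm simplex ETF vertices."""
    return np.sqrt(k / (k - 1)) * (np.eye(k) - np.ones((k, k)) / k)

def project_out(vectors, direction):
    """Orthogonally project the columns of `vectors` onto the complement of `direction`."""
    u = direction / np.linalg.norm(direction)
    return vectors - np.outer(u, u @ vectors)

def pairwise_cosines(vectors):
    """Return the cosines between all distinct pairs of columns."""
    unit = vectors / np.linalg.norm(vectors, axis=0, keepdims=True)
    gram = unit.T @ unit
    off_diag = ~np.eye(gram.shape[0], dtype=bool)
    return gram[off_diag]

K = 5        # number of classes (illustrative choice)
forget = 0   # index of the class to forget (illustrative choice)

M = simplex_etf(K)
original_cosines = pairwise_cosines(M)

# Assumption: forgetting is modelled as projecting out the forgotten vertex.
retained = np.delete(M, forget, axis=1)
projected = project_out(retained, M[:, forget])

projected_cosines = pairwise_cosines(projected)
projected_norms = np.linalg.norm(projected, axis=0)

print("original cosine: ", original_cosines[0])    # expected -1/(K-1)
print("projected cosine:", projected_cosines[0])   # expected -1/(K-2)
print("projected norms: ", projected_norms)        # expected all equal
```

The projected vectors also sum to zero, so they are centred like the original frame. This only verifies the geometry on ideal ETF vertices; real features reach this structure only approximately.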
