Certified Data Removal Under High-dimensional Settings

Haolin Zou; Arnab Auddy; Yongchan Kwon; Kamiar Rahnama Rad; Arian Maleki

arXiv:2505.07640 · stat.ML · May 13, 2025

TL;DR

This paper introduces a high-dimensional data unlearning algorithm that uses a two-step Newton process and noise addition to effectively remove specific data influence from trained models, with theoretical guarantees.

Contribution

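It extends data unlearning methods to high-dimensional settings, where both the sample size and the number of parameters grow large, and shows that two Newton steps suffice for certifiable removal, whereas a single step, which is adequate in low-dimensional cases, does not.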

Findings

01. Two Newton steps are sufficient for certifiable unlearning in high-dimensional regimes.

02. Adding scaled Laplacian noise ensures complete removal of forgotten data influence.

03. Numerical experiments validate the effectiveness and theoretical guarantees of the proposed method.

Abstract

Machine unlearning focuses on the computationally efficient removal of specific training data from trained models, ensuring that the influence of forgotten data is effectively eliminated without the need for full retraining. Despite advances in low-dimensional settings, where the number of parameters \( p \) is much smaller than the sample size \( n \), extending similar theoretical guarantees to high-dimensional regimes remains challenging. We propose an unlearning algorithm that starts from the original model parameters and performs a theory-guided sequence of Newton steps \( T \in \{ 1,2\}\). After this update, carefully scaled isotropic Laplacian noise is added to the estimate to ensure that any (potential) residual influence of forget data is completely removed. We show that when both \( n, p \to \infty \) with a fixed ratio \( n/p \), significant theoretical and computational…
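The update described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the model (ridge-penalized logistic regression), the function names, the penalty `lam`, and the value of `noise_scale` are all hypothetical choices made for the example. In particular, `noise_scale` is a placeholder; the paper derives the calibration needed for certifiability, and that calibration is not reproduced here. "Isotropic Laplacian" is read here as a density proportional to exp(-||b||_2 / scale), which is one common interpretation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_hess(beta, X, y, lam):
    """Gradient and Hessian of the ridge-penalized logistic loss (assumed model)."""
    mu = sigmoid(X @ beta)
    grad = X.T @ (mu - y) + lam * beta
    w = mu * (1.0 - mu)
    hess = (X * w[:, None]).T @ X + lam * np.eye(X.shape[1])
    return grad, hess

def newton(beta, X, y, lam, steps):
    """Run a fixed number of undamped Newton steps starting from beta."""
    for _ in range(steps):
        g, H = grad_hess(beta, X, y, lam)
        beta = beta - np.linalg.solve(H, g)
    return beta

def isotropic_laplace(p, scale, rng):
    """Draw b in R^p with density proportional to exp(-||b||_2 / scale):
    uniform direction on the sphere, radius ~ Gamma(shape=p, scale=scale)."""
    direction = rng.standard_normal(p)
    direction /= np.linalg.norm(direction)
    return rng.gamma(shape=p, scale=scale) * direction

def unlearn(beta_full, X, y, forget_idx, lam, T, noise_scale, rng):
    """Start at the full-data fit, take T Newton steps on the retained data,
    then add noise. Returns (noisy estimate, noiseless estimate)."""
    keep = np.setdiff1d(np.arange(X.shape[0]), forget_idx)
    beta_T = newton(beta_full, X[keep], y[keep], lam, steps=T)
    return beta_T + isotropic_laplace(X.shape[1], noise_scale, rng), beta_T

# Synthetic data with n and p of comparable size (illustrative values only).
rng = np.random.default_rng(0)
n, p, lam = 200, 100, 0.5
X = rng.standard_normal((n, p)) / np.sqrt(n)
y = (rng.random(n) < sigmoid(X @ rng.standard_normal(p))).astype(float)
forget_idx = np.arange(5)

beta_full = newton(np.zeros(p), X, y, lam, steps=50)           # original model
beta_retrain = newton(np.zeros(p), X[5:], y[5:], lam, steps=50)  # exact retraining

for T in (1, 2):
    _, beta_T = unlearn(beta_full, X, y, forget_idx, lam, T, 1e-3, rng)
    print(T, np.linalg.norm(beta_T - beta_retrain))
```

The printed distances compare the noiseless T-step estimate with exact retraining on the retained data; the added noise is then meant to mask whatever gap remains. The toy problem only shows the mechanics and does not demonstrate the paper's asymptotic claims.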

Code & Models

Repositories

krad-zz/Certified-Machine-Unlearning (official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Machine Learning in Materials Science