Towards Reliable Testing of Machine Unlearning

Anna Mazhar; Sainyam Galhotra

arXiv:2604.16536 · cs.LG · April 21, 2026

TL;DR

This paper proposes a causal, pathway-centric testing framework for machine unlearning that checks whether a model still relies on deleted data under realistic deployment constraints.

Contribution

It introduces a causal fuzzing approach for comprehensive, debuggable, and cost-effective unlearning testing applicable to black-box models.

Findings

01 Standard attribution checks can miss residual influence.

02 Causal testing uncovers proxy and subgroup effects.

03 A proof-of-concept demonstrates the effectiveness of the approach.
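The first two findings can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the paper's framework: the black-box `model`, the structural equation `proxy_from_target`, and the feature names `target`, `proxy`, `noise`, and `other` are all hypothetical. It contrasts a direct perturbation check, which changes only the targeted attribute, with an interventional check that also propagates the change to a downstream proxy.

```python
import random

def model(row):
    # Hypothetical "unlearned" black-box model: it never reads `target`,
    # but it still depends on `proxy`, which is caused by `target`.
    return 1.0 if 0.8 * row["proxy"] + 0.2 * row["other"] > 0.5 else 0.0

def proxy_from_target(target, noise):
    # Assumed structural equation for the proxy pathway (illustrative only).
    return 0.9 * target + 0.1 * noise

def sample_row(rng):
    target = rng.choice([0, 1])
    noise = rng.random()
    return {
        "target": target,
        "noise": noise,
        "proxy": proxy_from_target(target, noise),
        "other": rng.random(),
    }

def direct_effect(rows):
    # Attribution-style check: flip `target` only and leave `proxy` untouched.
    diffs = [abs(model(dict(r, target=1 - r["target"])) - model(r)) for r in rows]
    return sum(diffs) / len(diffs)

def pathway_effect(rows):
    # Interventional check: flip `target` and recompute `proxy` from it,
    # so influence mediated by the proxy becomes visible.
    diffs = []
    for r in rows:
        t = 1 - r["target"]
        intervened = dict(r, target=t, proxy=proxy_from_target(t, r["noise"]))
        diffs.append(abs(model(intervened) - model(r)))
    return sum(diffs) / len(diffs)

rng = random.Random(0)
rows = [sample_row(rng) for _ in range(200)]
print("direct effect:", direct_effect(rows))    # 0.0: the direct check sees nothing
print("pathway effect:", pathway_effect(rows))  # 1.0: the proxy pathway leaks
```

In this toy setup the direct check reports no influence because the model never reads `target`, while the interventional check shows that every prediction flips once the proxy is updated. Both checks only query the model, so they work in a black-box setting.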

Abstract

Machine learning components are now central to AI-infused software systems, from recommendations and code assistants to clinical decision support. As regulations and governance frameworks increasingly require deleting sensitive data from deployed models, machine unlearning is emerging as a practical alternative to full retraining. However, unlearning introduces a software quality-assurance challenge: under realistic deployment constraints and imperfect oracles, how can we test that a model no longer relies on targeted information? This paper frames unlearning testing as a first-class software engineering problem. We argue that practical unlearning tests must provide (i) thorough coverage over proxy and mediated influence pathways, (ii) debuggable diagnostics that localize where leakage persists, (iii) cost-effective regression-style execution under query budgets, and (iv) black-box…
