RePAIR: Interactive Machine Unlearning through Prompt-Aware Model Repair
Jagadeesh Rachapudi, Pranav Singh, Ritali Vatsi, Praful Hambarde, Amit Shukla

TL;DR
RePAIR introduces an interactive, user-driven framework in which users instruct a large language model, in natural language at inference time, to forget selected knowledge, giving end users rather than providers control over their own data.
Contribution
It proposes RePAIR, a prompt-aware model repair method that enables efficient, on-device unlearning through a training-free activation-manipulation technique.
Findings
RePAIR achieves near-zero forget scores while maintaining high model utility.
The low-rank variant of STAMP reduces computational complexity significantly.
RePAIR outperforms six state-of-the-art baselines in various knowledge suppression tasks.
Abstract
Large language models (LLMs) inherently absorb harmful knowledge, misinformation, and personal data during pretraining on large-scale web corpora, with no native mechanism for selective removal. While machine unlearning offers a principled solution, existing approaches are provider-centric, requiring retraining pipelines, curated retain datasets, and direct intervention by model service providers (MSPs), thereby excluding end users from controlling their own data. We introduce Interactive Machine Unlearning (IMU), a new paradigm in which users can instruct LLMs to forget targeted knowledge through natural language at inference time. To realize IMU, we propose RePAIR, a prompt-aware model repair framework comprising (i) a watchdog model for unlearning intent detection, (ii) a surgeon model for generating repair procedures, and (iii) a patient model whose parameters are updated…
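The division of labor among the three roles named in the abstract can be illustrated with a toy control-flow sketch. Everything in it is an assumption made for illustration: the names `watchdog`, `surgeon`, `Patient`, `RepairProcedure`, and `handle` are hypothetical, a regular expression stands in for the watchdog model, and the patient merely records suppressed topics instead of having its parameters or activations modified. It does not reproduce the paper's repair mechanism, only the order in which the three components might hand off work.

```python
import re
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class RepairProcedure:
    """Hypothetical container for what the surgeon hands to the patient."""
    target: str


def watchdog(prompt: str) -> Optional[str]:
    """Toy intent detector: return the forget target, or None.

    Assumption: a regex stands in for the watchdog model.
    """
    match = re.match(
        r"\s*(?:please\s+)?forget\s+(?:about\s+)?(.+?)[.!]?\s*$",
        prompt,
        re.IGNORECASE,
    )
    return match.group(1) if match else None


def surgeon(target: str) -> RepairProcedure:
    """Toy surgeon: turn a forget target into a repair procedure."""
    return RepairProcedure(target=target.lower())


@dataclass
class Patient:
    """Toy patient: tracks suppressed topics (no real model is edited)."""
    suppressed: Set[str] = field(default_factory=set)

    def apply(self, procedure: RepairProcedure) -> None:
        self.suppressed.add(procedure.target)

    def answer(self, prompt: str) -> str:
        if any(topic in prompt.lower() for topic in self.suppressed):
            return "I have no information about that."
        return f"(normal answer to: {prompt})"


def handle(patient: Patient, prompt: str) -> str:
    """Route a prompt: repair on unlearning intent, otherwise answer."""
    target = watchdog(prompt)
    if target is None:
        return patient.answer(prompt)
    patient.apply(surgeon(target))
    return f"Forgot: {target}"


if __name__ == "__main__":
    patient = Patient()
    print(handle(patient, "Please forget about Alice Smith."))
    print(handle(patient, "Who is Alice Smith?"))
    print(handle(patient, "What is the capital of France?"))
```

In this sketch an ordinary prompt bypasses the surgeon entirely, so only requests flagged by the watchdog trigger a repair step.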
