Module-Aware Parameter-Efficient Machine Unlearning on Transformers

Wenjie Bao; Jian Lou; Yuke Hu; Xiaochen Li; Zhihao Liu; Jiaqi Liu; Zhan Qin; Kui Ren

arXiv:2508.17233·cs.LG·August 26, 2025

4 Reviews

TL;DR

This paper introduces MAPE-Unlearn, a module-aware method for efficient machine unlearning in Transformers, accurately identifying influence-critical parameters to improve unlearning performance while maintaining efficiency.

Contribution

The paper presents a novel module-aware approach with learnable masks for precise parameter identification, enhancing unlearning effectiveness in Transformer models.

Findings

01. Effective unlearning across various Transformer models.

02. Robustness demonstrated through extensive experiments.

03. Improved accuracy over module-oblivious methods.

Abstract

The Transformer has become fundamental to a vast series of pre-trained large models that have achieved remarkable success across diverse applications. Machine unlearning, which focuses on efficiently removing specific data influences to comply with privacy regulations, shows promise in restricting updates to influence-critical parameters. However, existing parameter-efficient unlearning methods are largely devised in a module-oblivious manner, which tends to inaccurately identify these parameters and leads to inferior unlearning performance for Transformers. In this paper, we propose MAPE-Unlearn, a module-aware parameter-efficient machine unlearning approach that uses a learnable pair of masks to pinpoint influence-critical parameters in the heads and filters of Transformers. The learning objective of these masks is derived from the desiderata of unlearning and optimized through an…
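The idea of a pair of masks, one over attention heads and one over feed-forward filters, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the influence scores are made-up numbers, the top-k selection stands in for the paper's learned masks (whose objective and optimizer are not given in the text above), and `top_k_mask` and `masked_update` are hypothetical helper names.

```python
# Illustrative sketch only: fixed scores and a top-k rule stand in for the
# learned masks described in the abstract. Not the authors' algorithm.
from typing import List


def top_k_mask(scores: List[float], keep: int) -> List[int]:
    """Return a 0/1 mask marking the `keep` highest-scoring modules."""
    if not 0 <= keep <= len(scores):
        raise ValueError("keep must be between 0 and len(scores)")
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:keep])
    return [1 if i in chosen else 0 for i in range(len(scores))]


def masked_update(params: List[List[float]], grads: List[List[float]],
                  mask: List[int], lr: float) -> List[List[float]]:
    """Take a gradient step only on modules whose mask entry is 1.

    params[i] and grads[i] hold the flattened weights and gradients of
    module i (one attention head or one feed-forward filter).
    """
    return [
        [w - lr * g * m for w, g in zip(p_i, g_i)]
        for p_i, g_i, m in zip(params, grads, mask)
    ]


# Hypothetical influence scores for 4 attention heads and 6 FFN filters.
head_scores = [0.9, 0.1, 0.4, 0.05]
filter_scores = [0.2, 0.8, 0.1, 0.7, 0.05, 0.3]

head_mask = top_k_mask(head_scores, keep=1)      # one head stays trainable
filter_mask = top_k_mask(filter_scores, keep=2)  # two filters stay trainable

# Toy weights and gradients: two parameters per head.
head_params = [[1.0, 1.0] for _ in head_scores]
head_grads = [[0.5, -0.5] for _ in head_scores]
updated_heads = masked_update(head_params, head_grads, head_mask, lr=0.1)
```

Only the head selected by `head_mask` changes in `updated_heads`; the other heads keep their original weights, which is the sense in which the update is restricted to whole modules rather than to scattered individual parameters.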

Peer Reviews

Decision·ICLR 2026 Conference Withdrawn Submission

Reviewer 01·Rating 2·Confidence 4

Strengths

1. The paper introduces a novel module-aware masking strategy that updates only influential attention heads and feed-forward filters. This design reduces computational cost while preserving unlearning effectiveness.
2. Across GLUE, SQuAD, TOFU, and hazardous knowledge tasks, MAPE-Unlearn consistently outperforms baselines by achieving effective forgetting with minimal degradation in model fidelity.
3. The method shows clear advantages under successive unlearning and relearning attacks: two realis…

Weaknesses

1. The discussion of PEFT-related unlearning methods is not up-to-date and omits several recent works such as [1-3]. This makes the positioning of the paper less convincing and weakens its contribution relative to current literature.
2. The paper does not clearly articulate why unlearning in Transformers poses unique challenges for PEFT-based methods. It also fails to explicitly map these challenges to its proposed module-aware solution, making the motivation appear incomplete and insufficiently…

Reviewer 02·Rating 4·Confidence 4

Strengths

1. The paper’s key strength lies in introducing a module-aware unlearning paradigm that aligns with the inherently modular Transformer architecture. By targeting attention heads and feed-forward filters, it effectively addresses the limitations of prior parameter-efficient methods (e.g., SA, SURE) that overlook modularity or focus on fine-grained pruning.
2. The method is theoretically grounded rather than heuristic. The learnable masks are derived from unlearning objectives (MLR, MLF) that inte…

Weaknesses

1. As noted in Appendix D.5, the method underperforms full-parameter updates when the number of forget samples is small (e.g., TOFU Forget01). Since many real-world unlearning requests involve limited data, this limitation is particularly noteworthy.
2. The authors fix the sparsity level at 90% for most experiments but do not analyze how varying sparsity (e.g., 50%, 70%, 95%) affects performance across tasks, leaving scalability under different sparsity settings unexplored.
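To make the sparsity levels in point 2 concrete, the short sketch below counts how many modules would remain updatable at each level. It rests on two assumptions not taken from the paper: that sparsity means the fraction of modules frozen (the paper may define it over parameters instead), and that the model has a BERT-base-sized layout of 12 layers with 12 heads and 3072 feed-forward filters per layer, chosen purely for illustration. `modules_kept` is a hypothetical helper that rounds down.

```python
# Illustrative arithmetic only: sparsity is ASSUMED to be the percentage of
# modules frozen, and the layer sizes are an assumed BERT-base-like layout.

def modules_kept(total: int, sparsity_percent: int) -> int:
    """Modules left updatable when `sparsity_percent` of them are frozen."""
    if not 0 <= sparsity_percent <= 100:
        raise ValueError("sparsity_percent must be in [0, 100]")
    # Integer arithmetic avoids floating-point rounding; rounds down.
    return total * (100 - sparsity_percent) // 100


LAYERS = 12
HEADS_PER_LAYER = 12
FILTERS_PER_LAYER = 3072

total_heads = LAYERS * HEADS_PER_LAYER      # 144
total_filters = LAYERS * FILTERS_PER_LAYER  # 36864

for sparsity in (50, 70, 90, 95):
    heads = modules_kept(total_heads, sparsity)
    filters = modules_kept(total_filters, sparsity)
    print(f"sparsity {sparsity}%: {heads} heads, {filters} filters updatable")
```

Under these assumptions, moving from 90% to 95% sparsity halves the updatable set (from 14 to 7 heads), which is one reason a sweep over sparsity levels could reveal task-dependent behavior that a single fixed setting hides.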

Reviewer 03·Rating 2·Confidence 4

Strengths

1. The method uniquely operates at the module level (heads and filters) rather than individual parameters, better capturing Transformer architecture patterns.
2. Comprehensive experiments across diverse tasks and models demonstrate superior effectiveness at 90% sparsity while maintaining model fidelity.

Weaknesses

1. While the method is empirically strong, the paper provides limited theoretical justification for why module-level unlearning should be fundamentally more effective than parameter-level approaches for Transformers. The connection between module structure and unlearning efficacy remains somewhat heuristic.
2. The paper lacks thorough ablation studies on mask initialization strategies and the warm-start greedy algorithm. It remains unclear how sensitive the results are to these design choices, o…

Reviewer 04·Rating 6·Confidence 3

Strengths

1. The paper addresses an important problem: unlearning in large Transformer models. The idea of module-level selection is well-motivated by the Transformer structure.
2. The paper writing is clear.
3. The empirical evaluation is comprehensive. Extensive experiments on diverse tasks using different models demonstrate that the proposed method achieves strong forgetting while preserving utility.

Weaknesses

1. The approach assumes continued access to both the forget set and the retain set in order to compute gradients and Fisher/Hessian information. This is a strong assumption. For large LLMs, a clean retain set often cannot be reconstructed. The paper does not clarify how MAPE-Unlearn applies when forget data $D_f$ or retain data $D_r$ is unavailable.
2. Although the method is called “parameter-efficient,” the mask computation itself is still expensive. This cost must be paid for each unlearning request. The pape…
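The data-access and cost concerns raised above can be made concrete with a generic sketch of a Fisher-based module score. This is not the paper's procedure, which the excerpts here do not specify. It uses the standard empirical diagonal Fisher (the mean of squared per-sample gradients) and then sums the entries belonging to each module. The gradients are made-up numbers, and `diagonal_fisher` and `module_scores` are hypothetical helper names.

```python
# Generic sketch, NOT the paper's method: empirical diagonal Fisher from
# per-sample gradients, aggregated into one score per module.
from typing import List, Tuple


def diagonal_fisher(per_sample_grads: List[List[float]]) -> List[float]:
    """Mean of squared per-sample gradients, one entry per parameter."""
    n = len(per_sample_grads)
    if n == 0:
        raise ValueError("at least one per-sample gradient is required")
    dim = len(per_sample_grads[0])
    return [sum(g[j] ** 2 for g in per_sample_grads) / n for j in range(dim)]


def module_scores(fisher_diag: List[float],
                  module_slices: List[Tuple[int, int]]) -> List[float]:
    """Sum the Fisher entries in each module's [start, stop) parameter range."""
    return [sum(fisher_diag[start:stop]) for start, stop in module_slices]


# Made-up gradients: 2 samples, 4 parameters split into 2 modules of 2 each.
grads = [
    [1.0, 0.0, 2.0, 0.0],
    [3.0, 0.0, 0.0, 2.0],
]
fisher = diagonal_fisher(grads)
scores = module_scores(fisher, [(0, 2), (2, 4)])
```

The sketch shows why both concerns arise: every score needs a gradient for each sample in the set being scored, so the data must be available, and the work grows with the number of samples times the number of parameters each time a request is processed.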
