UniErase: Towards Balanced and Precise Unlearning in Language Models

Miao Yu; Liang Lin; Guibin Zhang; Xinfeng Li; Junfeng Fang; Xingrui Yu; Ivor Tsang; Ningyu Zhang; Kun Wang; Yang Wang

arXiv:2505.15674 · cs.CL · September 29, 2025

TL;DR

UniErase introduces a novel, precise, and balanced unlearning framework for large language models, effectively removing outdated knowledge while retaining overall model ability, outperforming existing methods across various benchmarks.

Contribution

The paper proposes UniErase, a new editing-based unlearning paradigm using Unlearning Tokens and Edits to achieve precise, balanced unlearning and ability retention in LLMs.

Findings

01. Outperforms 8 baselines on the TOFU benchmark.
02. Modifies only 3.66% of parameters for effective unlearning.
03. Achieves 4.01× better model ability retention than previous methods.
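The page does not say how the 3.66% figure is counted. One plausible reading is the share of scalar parameters that sit in weight tensors touched by the edit. The sketch below illustrates that reading on toy NumPy arrays; the function name `modified_fraction` and the layer names are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def modified_fraction(before, after):
    """Share of scalar parameters that live in tensors changed by an edit.

    `before` and `after` map parameter names to arrays of identical shapes.
    This counts whole tensors as modified, which is an assumption about how
    such a percentage could be measured, not the paper's stated procedure.
    """
    total = sum(w.size for w in before.values())
    changed = sum(
        w.size for name, w in before.items()
        if not np.array_equal(w, after[name])
    )
    return changed / total


# Toy model with 100 scalar parameters spread over three tensors.
before = {
    "layer0.mlp.down": np.zeros((4, 4)),   # 16 parameters
    "layer1.mlp.down": np.zeros((4, 4)),   # 16 parameters
    "embed": np.zeros((17, 4)),            # 68 parameters
}
after = {name: w.copy() for name, w in before.items()}
after["layer1.mlp.down"][0, 0] = 1.0       # the edit touches one tensor only

fraction = modified_fraction(before, after)
print(f"{fraction:.2%} of parameters sit in edited tensors")
```

Counting individual changed entries instead of whole tensors would give a smaller number, so any comparison across methods should fix one convention first.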

Abstract

Large language models (LLMs) require iterative updates to address the outdated information problem, where LLM unlearning offers an approach for selective removal. However, mainstream unlearning methods primarily rely on fine-tuning techniques, which often lack precision in targeted unlearning and struggle to balance unlearning efficacy with general ability under massive and sequential settings. To bridge this gap, in this work, we introduce UniErase, a novel unlearning framework that demonstrates precision and balanced performances between knowledge unlearning and ability retaining. We first propose the Unlearning Token, which is optimized to steer LLMs toward a forgetting space. To achieve concrete unlearning behaviors, we further introduce the lightweight Unlearning Edit to efficiently associate the unlearning targets with this meta-token. Serving as a new unlearning paradigm via…
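The abstract describes the Unlearning Edit only at a high level, and the reviews note that its closed-form update resembles earlier model-editing work. As a rough illustration of that family of techniques, the sketch below applies a generic ridge-regularized update to a linear layer treated as a key-value memory. It is not the paper's Eq. 11 or Eq. 13. The names `closed_form_edit`, `K`, `V_target`, and `ridge` are assumptions made for this example, with `V_target` standing in for whatever hidden value would steer the model toward the Unlearning Token.

```python
import numpy as np


def closed_form_edit(W, K, V_target, ridge=1e-4):
    """Return an edited weight W_new such that W_new @ K is close to V_target.

    W:        (d_out, d_in) weight of a linear layer viewed as a key-value memory.
    K:        (d_in, n) keys, one column per unlearning target (hypothetical).
    V_target: (d_out, n) values the keys should map to after the edit.
    ridge:    regularizer that keeps the update small and the solve stable.
    """
    residual = V_target - W @ K                    # what the current layer gets wrong
    gram = K.T @ K + ridge * np.eye(K.shape[1])    # (n, n) regularized Gram matrix
    delta = residual @ np.linalg.solve(gram, K.T)  # ridge least-squares update
    return W + delta


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
K = rng.normal(size=(16, 3))            # three keys to redirect
V_target = rng.normal(size=(8, 3))      # stand-in for "forgetting" values

W_new = closed_form_edit(W, K, V_target)
edit_error = np.abs(W_new @ K - V_target).max()

# An input orthogonal to every key is left untouched by the update.
x = rng.normal(size=16)
x_orth = x - K @ np.linalg.lstsq(K, x, rcond=None)[0]
drift = np.abs(W_new @ x_orth - W @ x_orth).max()
print(f"max edit error: {edit_error:.2e}, drift on unrelated input: {drift:.2e}")
```

The update only acts along the span of the edited keys, which is the intuition behind editing-based methods claiming to preserve unrelated abilities. Published editing methods typically add further terms, such as a covariance estimate over preserved keys, that this sketch omits.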

Peer Reviews

Decision · ICLR 2026 Conference Withdrawn Submission

Reviewer 01 · Rating 4 · Confidence 4

Strengths

**S1**. The paper clearly defines the Unlearning Logical Chain and derives closed-form parameter updates (Eq. 11, 13), bridging intuitive motivation and formalism.

**S2**. The evaluations include batch, sequential, and precise unlearning, with both synthetic (TOFU) and real (RETURN) data.

**S3**. The proposed method can be very efficient as it updates only a fraction of the parameters.

Weaknesses

**W1**. The claim that UniErase “pioneers” the modeling of LLM unlearning as a knowledge editing problem is overstated. For example, see [1].

**W2**. The notation is a bit sloppy, e.g., in Eq. 7, $a'$ was previously used for $D \setminus D_f$, and the frequent use of $a$ and $\alpha$ can be confusing.

**W3**. The presentation of the paper can be improved. For example, the methodology is explained through a few steps in Sec. 4.1–4.3, but each is very verbose and includes notational details that can b

Reviewer 02 · Rating 2 · Confidence 4

Strengths

- Breaks through the limitations of mainstream fine-tuning-based unlearning methods, proposing UniErase, a new unlearning paradigm that models LLM unlearning as a knowledge editing problem. By directly modifying model parameters instead of multi-round fine-tuning, it expands the research scope of the unlearning field and provides a new direction for subsequent studies.
- Achieves dual-high performance in unlearning efficacy and general ability retention. With only ~3.66% of LLM parameters modifie

Weaknesses

- This study lacks model diversity, as experiments were only conducted on two models from the LLaMA series, failing to provide verification of generalization.
- The expression in the paper is not clear enough, and the same issue applies to the presentation of figures. For instance, Figures 1, 2, 3, 4, and 6 almost all have overlaps between text and graphics, with text from subfigures even overlapping other subfigures. This clearly fails to meet the publication requirements for academic papers.
-

Reviewer 03 · Rating 2 · Confidence 5

Strengths

The paper presents a clear problem formulation and reports improved unlearning performance over benchmarked methods, while maintaining better balance in sequential unlearning scenarios.

Weaknesses

The novelty of UniErase is limited: its core model editing component closely mirrors prior work and offers little methodological advancement. The benchmarks used are outdated, and stronger recent unlearning methods are omitted, making the claimed advantage unconvincing. Moreover, UniErase underperforms on MMLU, HumanEval, and GSM-8K. The reported evaluation metric is a cocktail of multiple scores, which is a questionable approach to reporting performance.

Reviewer 04 · Rating 2 · Confidence 5

Strengths

This paper tackles the important problem of knowledge unlearning with a clear conceptual division between the meta unlearning token and the editing process. The meta unlearning token serves as a well-defined objective for model editing, and although the closed-form derivation is not new, it provides a transparent analytical means for implementing the edit.

Weaknesses

- UniErase borrows heavily from the editing playbook, and applying such editing paradigms to unlearning has been previously explored, so the contribution feels more like an engineering consolidation than a conceptual advance. The key difference appears to lie in defining a shared target object, [UNL], a construct that was discussed in a previous paper [1]. Yet the paper does not sufficiently clarify why this intermediate meta token is necessary or preferable compared to directly editing toward an “

Code & Models

Repositories

Ymm-cll/UniErase (Official)

Taxonomy

Methods: TOFU