Geometric-disentanglement Unlearning

Duo Zhou; Yuji Zhang; Tianxin Wei; Ruizhong Qiu; Ke Yang; Xiao Lin; Cheng Qian; Jingrui He; Hanghang Tong; Chengxiang Zhai; Heng Ji; Huan Zhang

arXiv:2511.17100·cs.LG·February 3, 2026

Open Access

TL;DR

This paper introduces Geometric-disentanglement Unlearning (GU), a theoretically grounded method that improves the balance between forgetting private or harmful content and retaining knowledge in large language models, with provable guarantees.

Contribution

It formalizes the 'no side effects' principle in LLM unlearning, and proposes a projection-based method that reduces forget-retain trade-offs with theoretical backing.

Findings

1. GU improves forgetting effectiveness by up to 62%, as measured by Extraction Strength (ES).

2. GU reduces collateral knowledge loss, improving retain ES by 31%.

3. Experiments on multiple datasets support GU's effectiveness and its theoretical analysis.

Abstract

Large language models (LLMs) can internalize private or harmful content, motivating unlearning that removes a forget set while preserving retained knowledge. However, forgetting updates often cause collateral degradation of retained knowledge, creating a persistent trade-off. Existing LLM unlearning methods are often heuristic, and other theoretical approaches rely on offline feature constructions that do not capture the update-time forget-retain interaction in LLMs. To address this limitation, we aim to develop an LLM unlearning method that reduces the forget-retain trade-off with theoretical guarantees. We take a first-principles view by formalizing "no side effects" as local retain invariance under small parameter updates, and prove an equivalence under optimizer-induced geometry: the retain loss is locally invariant if and only if the update direction is orthogonal to the subspace…
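The orthogonality condition described in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: the abstract is truncated before it names the subspace, so the code below assumes, purely for illustration, that the subspace is spanned by retain-set gradients and that the geometry is plain Euclidean (the paper refers to an optimizer-induced geometry, which may use a different inner product). The function name `project_out_retain` and the toy vectors are hypothetical.

```python
import numpy as np

def project_out_retain(forget_grad, retain_grads, rcond=1e-10):
    """Remove from forget_grad its component inside span(retain_grads).

    ASSUMPTION: Euclidean inner product and a retain subspace spanned by
    retain-set gradients; the paper's actual geometry may differ.
    """
    R = np.atleast_2d(np.asarray(retain_grads, dtype=float))  # shape (k, d)
    g = np.asarray(forget_grad, dtype=float)                  # shape (d,)
    # Least-squares coefficients c minimizing ||R.T @ c - g||_2, so that
    # R.T @ c is the orthogonal projection of g onto the row space of R.
    coeffs, *_ = np.linalg.lstsq(R.T, g, rcond=rcond)
    return g - R.T @ coeffs

# Toy example: three parameters, two retain gradients.
forget_grad = np.array([1.0, 2.0, 3.0])
retain_grads = np.array([[1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0]])

update = project_out_retain(forget_grad, retain_grads)

# First-order change of each retain loss along the update direction:
# <retain_grad_i, update>. Under the stated assumptions it vanishes.
first_order_change = retain_grads @ update
print(update, first_order_change)
```

In this toy setting the projected update keeps only the component of the forget gradient that no retain gradient shares, so a small step along it leaves each retain loss unchanged to first order. Using least squares rather than an explicit inverse keeps the sketch stable when retain gradients are linearly dependent.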

Taxonomy

Topics: Misinformation and Its Impacts · Advanced Graph Neural Networks · Adversarial Robustness in Machine Learning