Maximizing Local Entropy Where It Matters: Prefix-Aware Localized LLM Unlearning

Naixin Zhai; Pengyang Shao; Binbin Zheng; Yonghui Yang; Fei Shen; Long Bai; Xun Yang

arXiv:2601.03190 · cs.CL · April 21, 2026

TL;DR

PALU is a prefix-aware, localized unlearning method for LLMs. It maximizes entropy only on the sensitive prefix and the top-$k$ logits, forgetting sensitive information while preserving general utility.

Contribution

The paper proposes PALU, a framework that localizes unlearning to the sensitive prefix of a response and to the top-$k$ logits, reducing utility loss and avoiding redundant optimization over the full vocabulary.

Findings

01. PALU forgets sensitive content by suppressing only the sensitive prefix, without degrading overall performance.

02. Flattening only the top-$k$ logits suffices to maximize uncertainty in the critical subspace.

03. PALU outperforms existing methods in forgetting efficacy and utility preservation.

Abstract

Machine unlearning aims to forget sensitive knowledge from Large Language Models (LLMs) while maintaining general utility. However, existing approaches typically treat all tokens in a response indiscriminately and enforce uncertainty over the entire vocabulary. This global treatment results in unnecessary utility degradation and extends optimization to content-agnostic regions. To address these limitations, we propose PALU (Prefix-Aware Localized Unlearning), a framework driven by a local entropy maximization objective across both temporal and vocabulary dimensions. PALU reveals that (i) suppressing the sensitive prefix alone is sufficient to sever the causal generation link, and (ii) flattening only the top-$k$ logits is adequate to maximize uncertainty in the critical subspace. These findings allow PALU to alleviate redundant optimization across the full vocabulary and parameter space…
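
The two localization ideas in the abstract (restricting the objective to the prefix positions, and to the top-$k$ logits at each position) can be illustrated with a small sketch. This is not the authors' implementation: the function names `topk_entropy` and `local_entropy_loss`, the use of a fixed `prefix_len`, and the choice to renormalize the softmax over the top-$k$ logits are all assumptions made for illustration, and the paper's actual objective may differ.

```python
import math


def topk_entropy(logits, k):
    """Entropy (in nats) of the softmax renormalized over the k largest logits.

    Hypothetical helper: the renormalization over the top-k subset is an
    assumption, not necessarily the paper's formulation.
    """
    top = sorted(logits, reverse=True)[:k]
    m = max(top)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in top]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def local_entropy_loss(seq_logits, prefix_len, k):
    """Negative mean top-k entropy over the first prefix_len positions.

    Minimizing this value maximizes local entropy. Positions after the
    prefix do not contribute to the loss at all.
    """
    prefix = seq_logits[:prefix_len]
    if not prefix:
        return 0.0
    return -sum(topk_entropy(row, k) for row in prefix) / len(prefix)


if __name__ == "__main__":
    # Two positions, vocabulary of size 4; only the first position is "prefix".
    seq = [[2.0, 1.0, 0.5, -3.0], [10.0, 0.0, 0.0, 0.0]]
    print(local_entropy_loss(seq, prefix_len=1, k=2))
```

When the top-$k$ logits at a position are all equal, `topk_entropy` reaches its maximum of $\log k$, which is the "flattened" state the abstract refers to. In a real training setup the same quantity would be computed on model logits with an autodiff framework so that gradients flow only through the selected positions and logits.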

Code & Models

Repository: nxZhai/PALU (GitHub)
