Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models

Taha Entesari; Arman Hatami; Rinat Khaziev; Anil Ramakrishna; Mahyar Fazlyab

arXiv:2506.05314 · cs.CL · October 28, 2025

TL;DR

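This paper recasts unlearning in large language models as a constrained optimization problem: the model is pushed to forget a designated forget set while a hard constraint preserves its behavior on a separate retain set. The authors report more stable optimization and better retained utility than methods that merge both goals into a single scalarized loss.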

Contribution

The paper formulates LLM unlearning as a constrained problem, introduces a logit-margin flattening loss for the forget set, and solves the problem with a scalable primal-dual algorithm. The authors report that it outperforms existing methods on the TOFU and MUSE benchmarks.
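To make the primal-dual idea concrete, here is a minimal sketch on a one-dimensional toy problem. It is not the paper's algorithm or its losses: the quadratic "forget" and "retain" losses, the step sizes, and the function name `primal_dual` are all assumptions chosen for illustration. The sketch only shows the generic pattern of descending on the parameters while ascending on a non-negative multiplier that prices constraint violation.

```python
def primal_dual(forget_grad, retain_loss, retain_grad, eps, x0,
                lr_primal=0.05, lr_dual=0.05, steps=5000):
    """Generic primal-dual loop for: minimize forget(x) s.t. retain(x) <= eps.

    Hypothetical helper for illustration only. The Lagrangian is
    forget(x) + lam * (retain(x) - eps) with multiplier lam >= 0.
    """
    x, lam = x0, 0.0
    for _ in range(steps):
        # Primal direction: gradient of the Lagrangian with respect to x.
        grad_x = forget_grad(x) + lam * retain_grad(x)
        # Constraint violation, positive when the retain budget is exceeded.
        violation = retain_loss(x) - eps
        x = x - lr_primal * grad_x                  # primal descent
        lam = max(0.0, lam + lr_dual * violation)   # projected dual ascent
    return x, lam


# Toy instance (assumed): forgetting wants x = 0, retention wants x = 1,
# and the retain loss may not exceed eps = 0.25.
x_star, lam_star = primal_dual(
    forget_grad=lambda x: 2.0 * x,            # gradient of x**2
    retain_loss=lambda x: (x - 1.0) ** 2,
    retain_grad=lambda x: 2.0 * (x - 1.0),
    eps=0.25,
    x0=1.0,                                   # start at the retain optimum
)
print(round(x_star, 3), round(lam_star, 3))
```

In this toy instance the constrained optimum sits on the boundary of the retain budget (x = 0.5 with multiplier 1), so the loop forgets as much as the constraint allows and no further. The multiplier is learned during optimization instead of being fixed in advance like a regularization weight.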

Findings

01 Effectively removes targeted information from LLMs.

02 Maintains model utility and performance on retained data.

03 Demonstrates superior results on TOFU and MUSE benchmarks.

Abstract

Large Language Models (LLMs) deployed in real-world settings increasingly face the need to unlearn sensitive, outdated, or proprietary information. Existing unlearning methods typically formulate forgetting and retention as a regularized trade-off, combining both objectives into a single scalarized loss. This often leads to unstable optimization and degraded performance on retained data, especially under aggressive forgetting. We propose a new formulation of LLM unlearning as a constrained optimization problem: forgetting is enforced via a novel logit-margin flattening loss that explicitly drives the output distribution toward uniformity on a designated forget set, while retention is preserved through a hard constraint on a separate retain set. Compared to entropy-based objectives, our loss is softmax-free, numerically stable, and maintains non-vanishing gradients, enabling more…
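As a rough illustration of what a softmax-free flattening objective can look like, the sketch below uses one plausible form: the gap between the largest logit and the mean logit. This is an assumption made for illustration, since the paper's exact loss is not reproduced in the abstract above; the function names are hypothetical. The gap is zero exactly when all logits are equal (a uniform output distribution), and its subgradient has the same size no matter how confident the prediction is.

```python
def margin_flatness(logits):
    """Assumed flatness penalty: top logit minus mean logit (no softmax)."""
    return max(logits) - sum(logits) / len(logits)


def margin_flatness_subgrad(logits):
    """A subgradient of margin_flatness: +1 on the top logit, -1/V on every entry."""
    v = len(logits)
    top = logits.index(max(logits))
    return [(1.0 if i == top else 0.0) - 1.0 / v for i in range(v)]


def flatten(logits, lr=0.1, steps=500):
    """Plain subgradient descent on the logits themselves (toy setting)."""
    z = list(logits)
    for _ in range(steps):
        g = margin_flatness_subgrad(z)
        z = [zi - lr * gi for zi, gi in zip(z, g)]
    return z


flat = [2.0, 2.0, 2.0]
peaked = [5.0, 1.0, 0.0]
very_peaked = [50.0, 0.0, 0.0]

print(margin_flatness(flat))     # zero for a uniform prediction
print(margin_flatness(peaked))   # positive for a confident prediction
# The subgradient is identical for mildly and extremely peaked logits.
print(margin_flatness_subgrad(peaked) == margin_flatness_subgrad(very_peaked))
print(margin_flatness(flatten(peaked)))  # small after descent
```

With a constant step size the descent settles into a small oscillation around the flat point instead of converging exactly, which is expected for subgradient methods on a non-smooth objective.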

Videos

Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models (SlidesLive)

Taxonomy

Methods, TOFU