Agents Are All You Need for LLM Unlearning
Debdeep Sanyal, Murari Mandal

TL;DR
This paper introduces ALU, a multi-agent, retrain-free, model-agnostic framework for effective and practical inference-time unlearning in large language models, outperforming existing methods in robustness and scalability.
Contribution
The paper presents the first agentic LLM unlearning approach that operates without model updates, enabling real-time, scalable, and robust information removal.
Findings
ALU outperforms state-of-the-art unlearning methods in robustness.
ALU maintains a constant time cost regardless of the number of unlearning targets.
ALU successfully unlearns up to 1000 instances, exceeding previous evaluation scales.
Abstract
Information removal or suppression in large language models (LLMs) is a desired functionality, useful in AI regulation, legal compliance, safety, and privacy. LLM unlearning methods aim to remove information on demand from LLMs. Current LLM unlearning methods struggle to balance the unlearning efficacy and utility due to the competing nature of these objectives. Keeping the unlearning process computationally feasible without assuming access to the model weights is an overlooked area. In this work we show that agents might be all we need for effective and practical inference-time LLM unlearning. We present the first agentic LLM unlearning (ALU) method, a multi-agent, retrain-free, model-agnostic approach to LLM unlearning that achieves effective unlearning while preserving the utility. Our ALU framework unlearns by involving multiple LLM agents, each designed…
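To make the idea of inference-time, retrain-free unlearning concrete, here is a minimal Python sketch. It is not the ALU pipeline: the abstract above is truncated before it describes what each agent does, so the names `suppress`, `unlearning_wrapper`, and `toy_model`, the `[REDACTED]` mask, and the single string-matching step are all assumptions made for illustration. The only property it shares with the description above is that the underlying model is treated as a black box and is never updated.

```python
import re
from typing import Callable, List

# A black-box model: prompt in, text out. No access to weights is assumed.
Model = Callable[[str], str]


def suppress(text: str, targets: List[str], mask: str = "[REDACTED]") -> str:
    """Hypothetical post-processing step: mask mentions of forget targets."""
    if not targets:
        return text
    # Longest targets first so overlapping phrases are masked as a whole.
    ordered = sorted(targets, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    return pattern.sub(mask, text)


def unlearning_wrapper(model: Model, targets: List[str]) -> Model:
    """Wrap a model so its output is filtered at inference time.

    The wrapped model itself is unchanged; only its responses are edited.
    """
    def wrapped(prompt: str) -> str:
        return suppress(model(prompt), targets)
    return wrapped


def toy_model(prompt: str) -> str:
    # Stand-in for a real LLM call (illustrative only).
    return "Harry Potter was written by J.K. Rowling."


guarded = unlearning_wrapper(toy_model, ["Harry Potter"])
print(guarded("Who wrote the books?"))
```

A real agentic system would replace the regular-expression step with LLM-based stages that can catch paraphrases and indirect references; the sketch only shows where such stages sit relative to the untouched base model.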
Taxonomy
Topics: Multi-Agent Systems and Negotiation
Methods: Sparse Evolutionary Training
