SEPS: A Separability Measure for Robust Unlearning in LLMs

Wonje Jeung; Sangyeon Yoon; Albert No

arXiv:2505.14832 · cs.CL · May 28, 2025

TL;DR

This paper introduces SEPS, a new evaluation framework for unlearning in LLMs that assesses the model's ability to forget and retain information simultaneously within mixed prompts, addressing limitations of existing metrics.

Contribution

The paper proposes SEPS, an evaluation framework, and Mixed Prompt unlearning, a training strategy that makes unlearning in LLMs more robust when forget and retain queries are mixed in the same prompt.

Findings

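01. SEPS measures how well a model forgets and retains information when both kinds of query appear in a single prompt.

02. Mixed Prompt unlearning improves robustness in multi-query settings.

03. Existing methods often overfit or, once a forget query appears, erase retain content along with forget content.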

Abstract

Machine unlearning aims to selectively remove targeted knowledge from Large Language Models (LLMs), ensuring they forget specified content while retaining essential information. Existing unlearning metrics assess whether a model correctly answers retain queries and rejects forget queries, but they fail to capture real-world scenarios where forget queries rarely appear in isolation. In fact, forget and retain queries often coexist within the same prompt, making mixed-query evaluation crucial. We introduce SEPS, an evaluation framework that explicitly measures a model's ability to both forget and retain information within a single prompt. Through extensive experiments across three benchmarks, we identify two key failure modes in existing unlearning methods: (1) untargeted unlearning indiscriminately erases both forget and retain content once a forget query appears, and (2) targeted…
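The mixed-query setting described in the abstract can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the prompt template, the names `MixedPrompt`, `build_mixed_prompts`, and `toy_separation_score`, and the scoring rule itself. In particular, the score below is a simple stand-in and is not the SEPS definition, which this page does not state.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class MixedPrompt:
    """One prompt containing a forget query and a retain query (hypothetical type)."""
    text: str
    forget_query: str
    retain_query: str


def build_mixed_prompts(forget_queries, retain_queries):
    """Pair every forget query with every retain query inside a single prompt.

    The numbered template is an assumption. Both orderings are emitted so
    that the position of the forget query varies across prompts.
    """
    prompts = []
    for f, r in product(forget_queries, retain_queries):
        prompts.append(MixedPrompt(f"1. {f}\n2. {r}", f, r))
        prompts.append(MixedPrompt(f"1. {r}\n2. {f}", f, r))
    return prompts


def toy_separation_score(outcomes):
    """Fraction of mixed prompts where the model both forgot and retained.

    `outcomes` is a list of (forgot, retained) booleans, one pair per mixed
    prompt, produced by whatever judging procedure is in use. This is an
    illustrative stand-in, NOT the paper's SEPS metric.
    """
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    both = sum(1 for forgot, retained in outcomes if forgot and retained)
    return both / len(outcomes)


if __name__ == "__main__":
    prompts = build_mixed_prompts(["forget question A"], ["retain question B"])
    for p in prompts:
        print(p.text, end="\n---\n")

    # A model that refuses the whole prompt once a forget query appears
    # forgets but does not retain, so it earns no credit under this toy score.
    print(toy_separation_score([(True, False), (True, False)]))
```

Under this toy rule, a model that erases both forget and retain content whenever a forget query appears scores zero even though it would pass a forget-only check, which is the kind of gap that mixed-query evaluation is meant to expose.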

Taxonomy

Topics: Diverse Research and Applications · Higher Education Learning Practices