Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate

Zhiqi Bu; Xiaomeng Jin; Bhanukiran Vinzamuri; Anil Ramakrishna; Kai-Wei Chang; Volkan Cevher; Mingyi Hong

arXiv:2410.22086·cs.LG·May 7, 2025



TL;DR

This paper introduces NGDiff, an optimization-based approach to machine unlearning that balances the forgetting objective against model performance using a normalized gradient difference with an adaptive learning rate. It reports better results than state-of-the-art unlearning methods on the TOFU and MUSE datasets.

Contribution

The paper frames machine unlearning as a regularized multi-task optimization problem and proposes the NGDiff algorithm to solve it, paired with an automatic learning rate scheduler and supported by theoretical analysis.

Findings

01 NGDiff outperforms state-of-the-art unlearning methods on the TOFU and MUSE datasets.

02 NGDiff trains stably and gives better control over the trade-off between forgetting and model performance.

03 The paper supports the optimization-based approach with a theoretical analysis.

Abstract

Machine unlearning has been used to remove unwanted knowledge acquired by large language models (LLMs). In this paper, we examine machine unlearning from an optimization perspective, framing it as a regularized multi-task optimization problem, where one task optimizes a forgetting objective and another optimizes the model performance. In particular, we introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives, while integrating a new, automatic learning rate scheduler. We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets while exhibiting stable training.
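
The abstract names a normalized gradient difference but does not spell out the update rule. The sketch below is one plausible reading, not the authors' implementation: it assumes the update direction is the unit-normalized gradient of the retain (performance) loss minus the unit-normalized gradient of the forgetting loss. The function names (`normalize`, `ngdiff_direction`, `ngdiff_step`), the `eps` guard, and the fixed learning rate `lr` are all illustrative assumptions; the paper's automatic learning rate scheduler is not reproduced here.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def normalize(vec, eps=1e-12):
    """Scale vec to (approximately) unit length; eps guards against a zero gradient."""
    n = l2_norm(vec)
    return [x / (n + eps) for x in vec]


def ngdiff_direction(grad_retain, grad_forget):
    """Assumed direction: normalized retain gradient minus normalized forget gradient."""
    r = normalize(grad_retain)
    f = normalize(grad_forget)
    return [ri - fi for ri, fi in zip(r, f)]


def ngdiff_step(params, grad_retain, grad_forget, lr):
    """One descent step along the assumed direction with a fixed (hypothetical) lr."""
    d = ngdiff_direction(grad_retain, grad_forget)
    return [p - lr * di for p, di in zip(params, d)]


if __name__ == "__main__":
    params = [0.0, 0.0]
    grad_retain = [3.0, 4.0]    # toy gradient of the retain loss
    grad_forget = [0.0, 100.0]  # toy gradient of the forget loss, much larger in norm
    print(ngdiff_direction(grad_retain, grad_forget))
    print(ngdiff_step(params, grad_retain, grad_forget, lr=0.1))
```

Under this reading, neither objective dominates because of gradient scale alone: the forget gradient in the toy example is 20 times larger in norm, yet both terms contribute unit-length vectors. By the Cauchy-Schwarz inequality, the direction has a nonnegative inner product with the retain gradient and a nonpositive one with the forget gradient, so to first order a small step does not increase the retain loss and does not decrease the forget loss.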



Taxonomy

Topics: Neural Networks and Applications

Methods: TOFU