Per-parameter Task Arithmetic for Unlearning in Large Language Models

Chengyi Cai; Zesheng Ye; Jiangchao Yao; Jianzhong Qi; Bo Han; Xiaolu Zhang; Feng Liu; Jun Zhou

arXiv:2601.22030·cs.LG·January 30, 2026

Open Access

TL;DR

This paper introduces PerTA, a per-parameter adjustment method for task arithmetic in large language model unlearning, improving the removal of private information while preserving model utility.

Contribution

It proposes a novel per-parameter rescaling mechanism for task arithmetic, enhancing unlearning effectiveness and reducing over-forgetting in large language models.

Findings

01. PerTA outperforms standard task vectors in unlearning tasks.

02. PerTA surpasses training-based unlearning methods in effectiveness.

03. The method maintains model utility while improving privacy removal.

Abstract

Large language model (LLM) unlearning aims to remove private information from a model. Task arithmetic unlearns by subtracting a specific task vector (TV), defined as the parameter difference between a privacy-information-tuned model and the original model. While efficient, this subtraction can cause over-forgetting by disrupting parameters essential for retaining other information. Motivated by the observation that each parameter exhibits different importance for forgetting versus retention, we propose a per-parameter task arithmetic (PerTA) mechanism that rescales the TV with per-parameter weights. These weights quantify the relative importance of each parameter for forgetting versus retention, estimated via gradients (i.e., PerTA-grad) or the diagonal Fisher information approximation (i.e., PerTA-fisher). Moreover, we discuss the effectiveness of PerTA, extend it to a more general form, and…
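To make the idea in the abstract concrete, here is a minimal sketch of task-vector subtraction with a per-parameter rescaling, written in plain Python with flat lists standing in for model tensors. The abstract does not give the paper's actual weighting formula, so the rule used here (forget importance divided by the sum of forget and retain importance, with importance taken as the mean squared gradient as a diagonal Fisher approximation) is an assumption chosen for illustration, not the authors' definition. All function names, the `alpha` scale, and the toy numbers are likewise hypothetical.

```python
from typing import Dict, List

Params = Dict[str, List[float]]  # parameter name -> flat list of values


def task_vector(tuned: Params, base: Params) -> Params:
    """TV = privacy-information-tuned parameters minus original parameters."""
    return {k: [t - b for t, b in zip(tuned[k], base[k])] for k in base}


def diag_fisher(per_sample_grads: List[Params]) -> Params:
    """Diagonal Fisher approximation: mean of squared per-sample gradients."""
    n = len(per_sample_grads)
    first = per_sample_grads[0]
    return {
        k: [sum(g[k][i] ** 2 for g in per_sample_grads) / n
            for i in range(len(first[k]))]
        for k in first
    }


def relative_weights(forget_imp: Params, retain_imp: Params,
                     eps: float = 1e-12) -> Params:
    """ASSUMED rule (not from the paper): w = forget / (forget + retain)."""
    return {
        k: [f / (f + r + eps) for f, r in zip(forget_imp[k], retain_imp[k])]
        for k in forget_imp
    }


def unlearn(base: Params, tv: Params, weights: Params,
            alpha: float = 1.0) -> Params:
    """theta_unlearned = theta_base - alpha * (w * TV), elementwise."""
    return {
        k: [b - alpha * w * d for b, w, d in zip(base[k], weights[k], tv[k])]
        for k in base
    }


# Toy example: two scalar parameters in one hypothetical tensor "w".
base = {"w": [1.0, 1.0]}
tuned = {"w": [1.5, 1.5]}
# Hypothetical per-sample gradients on the forget and retain data.
forget_grads = [{"w": [2.0, 0.0]}, {"w": [2.0, 0.0]}]
retain_grads = [{"w": [0.0, 3.0]}, {"w": [0.0, 3.0]}]

tv = task_vector(tuned, base)
weights = relative_weights(diag_fisher(forget_grads), diag_fisher(retain_grads))
unlearned = unlearn(base, tv, weights)

# Plain task arithmetic corresponds to all weights equal to 1.
uniform = {"w": [1.0, 1.0]}
plain = unlearn(base, tv, uniform)
```

In this toy setting, plain task arithmetic moves both parameters by the full task vector. The weighted variant moves only the first parameter, which matters for the forget data, and leaves the second, which matters only for the retain data, unchanged. That is the over-forgetting behavior the per-parameter rescaling is meant to reduce.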

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Domain Adaptation and Few-Shot Learning · Topic Modeling