Knowledge Vector Weakening: Efficient Training-free Unlearning for Large Vision-Language Models
Yejin Kim, Dongjun Hwang, Sungmin Cha, Junsuk Choe

TL;DR
This paper introduces Knowledge Vector Weakening (KVW), a training-free method for unlearning specific data in large vision-language models, reducing computational costs while maintaining effective forgetting of undesirable knowledge.
Contribution
KVW is a novel, training-free approach that directly weakens knowledge vectors in large models, offering an efficient alternative to gradient-based unlearning methods.
Findings
KVW achieves a stable forget-retain trade-off.
KVW significantly improves computational efficiency over gradient-based unlearning, since it requires no gradient computation.
KVW outperforms gradient-based and LoRA-based methods.
Abstract
Large Vision-Language Models (LVLMs) are widely adopted for their strong multimodal capabilities, yet they raise serious concerns such as privacy leakage and harmful content generation. Machine unlearning has emerged as a promising solution for removing the influence of specific data from trained models. However, existing approaches largely rely on gradient-based optimization, incurring substantial computational costs for large-scale LVLMs. To address this limitation, we propose Knowledge Vector Weakening (KVW), a training-free unlearning method that directly intervenes in the full model without gradient computation. KVW identifies knowledge vectors that are activated during the model's output generation on the forget set and progressively weakens their contributions, thereby preventing the model from exploiting undesirable knowledge. Experiments on the MLLMU and CLEAR benchmarks…
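The identify-then-weaken loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes that "knowledge vectors" are the value vectors of a single feed-forward block, that a vector's contribution is its mean activation coefficient on the forget set times its norm, and that weakening means multiplying the vector by a fixed factor. The names `relu_coeffs`, `contribution_scores`, `weaken`, `top_k`, `decay`, and `steps` are all hypothetical.

```python
import math

def relu_coeffs(x, w_in):
    """Activation coefficient of each neuron for one input: ReLU(x . w_in[i])."""
    return [max(0.0, sum(xj * wj for xj, wj in zip(x, row))) for row in w_in]

def contribution_scores(forget_inputs, w_in, vectors):
    """Assumed score: mean coefficient over the forget set times vector norm."""
    n = len(forget_inputs)
    mean_coeff = [0.0] * len(w_in)
    for x in forget_inputs:
        for i, a in enumerate(relu_coeffs(x, w_in)):
            mean_coeff[i] += a / n
    norms = [math.sqrt(sum(v * v for v in vec)) for vec in vectors]
    return [c * s for c, s in zip(mean_coeff, norms)]

def weaken(forget_inputs, w_in, knowledge_vectors, top_k=1, decay=0.5, steps=3):
    """Repeatedly scale down the top-k contributing vectors; no gradients used."""
    vectors = [list(vec) for vec in knowledge_vectors]  # leave the input untouched
    for _ in range(steps):
        scores = contribution_scores(forget_inputs, w_in, vectors)
        ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
        for i in ranked[:top_k]:
            vectors[i] = [decay * v for v in vectors[i]]
    return vectors

# Toy block with three neurons; the forget inputs activate neurons 0 and 2 only.
w_in = [[1, 0], [0, 1], [1, 1]]
knowledge_vectors = [[2, 0], [0, 1], [1, 1]]
forget_inputs = [[1, 0], [2, 0]]

weakened = weaken(forget_inputs, w_in, knowledge_vectors)
print(weakened)  # [[0.5, 0.0], [0, 1], [0.5, 0.5]]
```

In this toy run the vector of neuron 1, which the forget inputs never activate, is left unchanged, while the two activated vectors are shrunk over successive steps. Scores are recomputed at every step, so the selection can move to a different vector once the previous top contributor has been weakened.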
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
