Cross-Modal Unlearning via Influential Neuron Path Editing in Multimodal Large Language Models
Kunhao Li, Wenhao Li, Di Wu, Lei Yang, Jun Bai, Ju Jia, Jason Xue

TL;DR
This paper introduces MIP-Editor, a novel method for effective multimodal unlearning in large language models that selectively forget sensitive information across modalities while maintaining overall performance.
Contribution
It proposes a modality-specific attribution and influential neuron path editing approach to improve consistency and effectiveness of unlearning in multimodal models.
Findings
Achieves up to 87.75% forgetting rate on multimodal tasks.
Improves general knowledge retention by up to 54.26%.
Preserves 77.9% of performance on textual tasks.
Abstract
Multimodal Large Language Models (MLLMs) extend foundation models to real-world applications by integrating inputs such as text and vision. However, their broad knowledge capacity raises growing concerns about privacy leakage, toxicity mitigation, and intellectual property violations. Machine Unlearning (MU) offers a practical solution by selectively forgetting targeted knowledge while preserving overall model utility. When applied to MLLMs, existing neuron-editing-based MU approaches face two fundamental challenges: (1) forgetting becomes inconsistent across modalities because existing point-wise attribution methods fail to capture the structured, layer-by-layer information flow that connects different modalities; and (2) general knowledge performance declines when sensitive neurons that also support important reasoning paths are pruned, as this disrupts the model's ability to…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
TopicsMultimodal Machine Learning Applications · Advanced Neural Network Applications · Topic Modeling
