SineProject: Machine Unlearning for Stable Vision Language Alignment
Arpit Garg, Hemanth Saratchandran, Simon Lucey

TL;DR
SineProject is a method that stabilizes vision-language alignment during machine unlearning by improving the Jacobian conditioning of the projector network, so that specific knowledge can be forgotten without degrading the model's behavior on benign queries.
Contribution
SineProject augments the frozen projector with sinusoidally modulated trainable parameters, which improves the spectral conditioning of the projector's Jacobian and preserves alignment in multimodal models during unlearning.
Findings
Reduces benign query refusals during unlearning
Achieves complete forgetting of targeted information
Maintains state-of-the-art forget-retain trade-offs with minimal overhead
Abstract
Multimodal Large Language Models (MLLMs) increasingly need to forget specific knowledge, such as unsafe or private information, without requiring full retraining. However, existing unlearning methods often disrupt vision-language alignment, causing models to reject both harmful and benign queries. We trace this failure to the projector network: during unlearning, its Jacobian becomes severely ill-conditioned, leading to unstable optimization and drift in cross-modal embeddings. We introduce SineProject, a simple method that augments the frozen projector with sinusoidally modulated trainable parameters, improving the Jacobian's spectral conditioning and stabilizing alignment throughout unlearning. Across standard safety and privacy unlearning benchmarks using LLaVA v1.5 7B and 13B, SineProject reduces benign query refusals while achieving complete forgetting of targeted information, yielding…
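The abstract does not spell out the parameterization, so the following is only an illustrative sketch, not the authors' implementation. It assumes a toy 2x2 linear projector whose effective weight is the frozen weight plus an elementwise sin(omega * theta) term, and it measures conditioning as the ratio of the largest to the smallest singular value of the Jacobian (for a linear map, the Jacobian with respect to the input is the weight matrix itself). The names effective_weight, project, condition_number_2x2, theta, and omega are hypothetical and chosen for this example.

```python
import math


def effective_weight(w0, theta, omega=1.0):
    """Assumed form: frozen weight w0 plus a bounded update sin(omega * theta).

    w0 stays fixed; only theta would be trained. Each entry of the update
    lies in [-1, 1], and theta = 0 reproduces the frozen projector exactly.
    """
    return [
        [w + math.sin(omega * t) for w, t in zip(w_row, t_row)]
        for w_row, t_row in zip(w0, theta)
    ]


def project(weight, x):
    """Toy linear projector y = W x; its Jacobian with respect to x is W."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def condition_number_2x2(m):
    """Ratio of largest to smallest singular value of a 2x2 matrix."""
    (a, b), (c, d) = m
    s = a * a + b * b + c * c + d * d  # s1^2 + s2^2 (squared Frobenius norm)
    det = a * d - b * c                # |det| equals s1 * s2
    disc = math.sqrt(max(s * s - 4.0 * det * det, 0.0))
    s_max = math.sqrt((s + disc) / 2.0)
    s_min = math.sqrt(max((s - disc) / 2.0, 0.0))
    return math.inf if s_min == 0.0 else s_max / s_min


# Hypothetical values, for illustration only.
w0 = [[2.0, 0.0], [0.0, 0.5]]        # frozen projector weight
theta = [[0.0, 0.3], [-0.2, 0.4]]    # trainable parameters
w_eff = effective_weight(w0, theta, omega=1.0)
features = project(w_eff, [1.0, 1.0])
kappa_frozen = condition_number_2x2(w0)
kappa_updated = condition_number_2x2(w_eff)
```

Tracking a quantity like kappa_updated over the course of unlearning is one way to observe whether the projector's Jacobian is becoming ill-conditioned. The sketch makes no claim about how much the sinusoidal term helps; that is what the paper's experiments evaluate.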
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning
