Neural Gate: Mitigating Privacy Risks in LVLMs via Neuron-Level Gradient Gating
Xiangkui Cao, Jie Zhang, Meina Kan, Shiguang Shan, Xilin Chen

TL;DR
Neural Gate is a novel neuron-level editing technique that enhances privacy protection in LVLMs by increasing their refusal rate to privacy-related queries without degrading their overall performance.
Contribution
The paper introduces Neural Gate, a new method for neuron-level model editing that improves privacy safeguards in LVLMs, especially against unseen sensitive queries.
Findings
Significantly increases the refusal rate for privacy-related queries.
Maintains model utility and performance on standard tasks.
Effective on MiniGPT and LLaVA models.
Abstract
Large Vision-Language Models (LVLMs) have shown remarkable potential across a wide array of vision-language tasks, leading to their adoption in critical domains such as finance and healthcare. However, their growing deployment also introduces significant security and privacy risks. Malicious actors could potentially exploit these models to extract sensitive information, highlighting a critical vulnerability. Recent studies show that LVLMs often fail to consistently refuse instructions designed to compromise user privacy. While existing work on privacy protection has made meaningful progress in preventing the leakage of sensitive data, it is constrained by limitations in both generalization and non-destructiveness. Existing methods often struggle to robustly handle unseen privacy-related queries and may inadvertently degrade a model's performance on standard tasks. To address these challenges, we…
Taxonomy
Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
