Single-Sample Black-Box Membership Inference Attack against Vision-Language Models via Cross-modal Semantic Alignment
Jiaqing Li, Yajuan Lu, Xiaochuan Shi, Gang Wu, ZhongYuan Wang, Chao Liang

TL;DR
This paper introduces a novel black-box membership inference attack on vision-language models that leverages cross-modal semantic alignment to determine if a sample was part of the training data, even in single-sample scenarios.
Contribution
The authors propose a new MIA framework based on cross-modal semantic alignment that works effectively in strict black-box and single-sample settings, surpassing existing methods.
Findings
Achieves an AUC of 0.821 on the VL-MIA/Flickr dataset against LLaVA-1.5
Outperforms existing baselines significantly
Remains robust under diverse image perturbations
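For readers unfamiliar with the metric, the AUC quoted above can be read as the probability that a randomly chosen member sample receives a higher attack score than a randomly chosen non-member. The sketch below computes it by direct pairwise comparison; the function name `pairwise_auc` and the example scores are illustrative assumptions, not values or code from the paper.

```python
from typing import Sequence


def pairwise_auc(member_scores: Sequence[float],
                 nonmember_scores: Sequence[float]) -> float:
    """AUC as the fraction of (member, non-member) pairs ranked correctly.

    Ties count as half a correct pair. This is the standard
    rank-based definition of AUC, not the paper's own code.
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")

    correct = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                correct += 1.0   # member ranked above non-member
            elif m == n:
                correct += 0.5   # tie: half credit
    return correct / (len(member_scores) * len(nonmember_scores))


# Hypothetical attack scores, for illustration only.
example_auc = pairwise_auc([0.9, 0.3], [0.5, 0.1])
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation of members from non-members.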
Abstract
Vision-Language Models (VLMs) have achieved remarkable success, yet their reliance on massive datasets and unintended memorization of training data raise significant data security risks. Membership Inference Attacks (MIAs) aim to assess these risks by determining whether a data sample was included in a model's training set. However, existing MIA methods against VLMs face critical bottlenecks: gray-box methods rely on internal logits that are typically restricted in real-world Application Programming Interfaces (APIs), while black-box methods depend on large-scale statistical distributions and therefore struggle in single-sample scenarios. To this end, we investigate MIAs from the perspective of cross-modal semantic alignment, and observe that member images exhibit significantly stronger image-caption alignment due to training memorization, whereas generated captions for non-members may deviate…
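The alignment idea in the abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that the image and the caption returned by the target VLM have already been embedded into a shared vector space by some external encoder, and that membership is decided by thresholding their cosine similarity. The abstract excerpt does not specify the paper's actual scoring function, encoder, or decision rule, so `alignment_score`, `infer_membership`, and the `threshold` value are all assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple


def alignment_score(image_emb: Sequence[float],
                    caption_emb: Sequence[float]) -> float:
    """Cosine similarity between an image embedding and a caption embedding."""
    if len(image_emb) != len(caption_emb):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(image_emb, caption_emb))
    norm_i = math.sqrt(sum(a * a for a in image_emb))
    norm_c = math.sqrt(sum(b * b for b in caption_emb))
    if norm_i == 0.0 or norm_c == 0.0:
        return 0.0  # treat a zero vector as carrying no alignment signal
    return dot / (norm_i * norm_c)


def infer_membership(image_emb: Sequence[float],
                     caption_emb: Sequence[float],
                     threshold: float = 0.5) -> Tuple[float, bool]:
    """Return (score, predicted_member) for a single sample.

    The threshold is a hypothetical placeholder; in practice it would
    have to be calibrated, which this sketch does not attempt.
    """
    score = alignment_score(image_emb, caption_emb)
    return score, score >= threshold


# Toy embeddings, for illustration only.
score, is_member = infer_membership([1.0, 0.0], [1.0, 0.0])
```

The sketch needs only the model's generated caption, which matches the black-box setting described above: no logits or other internal outputs are used.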
