Vid-SME: Membership Inference Attacks against Large Video Understanding Models

Qi Li; Runpeng Yu; Xinchao Wang

arXiv:2506.03179 · cs.CV · June 5, 2025

TL;DR

This paper introduces Vid-SME, a novel membership inference attack tailored for large video understanding models, effectively identifying whether specific videos were part of the training data by analyzing model confidence and temporal variations.

Contribution

The paper presents the first video-specific membership inference method, Vid-SME, which leverages Sharma-Mittal entropy and temporal frame analysis to improve attack accuracy on video models.
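
For readers unfamiliar with it, Sharma-Mittal entropy is a two-parameter generalization of Shannon entropy. The sketch below computes its standard textbook form for a discrete distribution, such as a model's next-token probabilities. It is not the authors' implementation: the function name, the parameter names q and r, and the example inputs are illustrative assumptions, and how Vid-SME combines the entropy with temporal frame analysis is not shown here.

```python
import math


def sharma_mittal_entropy(probs, q, r):
    """Sharma-Mittal entropy of a discrete distribution.

    H_{q,r}(p) = ((sum_i p_i**q) ** ((1 - r) / (1 - q)) - 1) / (1 - r)

    Assumes q > 0, q != 1 and r != 1; the limiting cases
    (Renyi, Shannon) are intentionally left out of this sketch.
    """
    if q <= 0 or q == 1 or r == 1:
        raise ValueError("this sketch requires q > 0, q != 1 and r != 1")
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0, rel_tol=1e-6, abs_tol=1e-6):
        raise ValueError("probabilities must sum to 1")

    # Zero-probability outcomes contribute nothing to the power sum.
    power_sum = sum(p ** q for p in probs if p > 0)
    return (power_sum ** ((1 - r) / (1 - q)) - 1) / (1 - r)


if __name__ == "__main__":
    # Hypothetical distributions: a flat one and a confident (peaked) one.
    uniform = [0.25, 0.25, 0.25, 0.25]
    peaked = [0.7, 0.1, 0.1, 0.1]
    print(sharma_mittal_entropy(uniform, q=2, r=2))
    print(sharma_mittal_entropy(peaked, q=2, r=2))
```

When r equals q the expression reduces to Tsallis entropy, and as r approaches 1 it approaches Rényi entropy. A more confident (peaked) distribution yields a lower value than a flat one, which is the kind of confidence signal a membership inference score can build on.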

Findings

01. Vid-SME achieves high true positive rates at low false positive rates.

02. It effectively captures temporal variations in videos for membership inference.

03. Experimental results show strong effectiveness on various models.
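
The first finding refers to the true positive rate at a low false positive rate, a common way to evaluate membership inference attacks. As a minimal sketch of how that metric can be computed from attack scores, the function below scans candidate thresholds and reports the best TPR whose FPR stays within a budget. The function name, the score lists, and the convention that a higher score means "more likely a member" are assumptions for illustration; they are not taken from the paper.

```python
def tpr_at_fpr(member_scores, nonmember_scores, max_fpr):
    """Best true positive rate achievable with FPR <= max_fpr.

    Assumes a higher score means "more likely a training member".
    A sample is predicted to be a member when score >= threshold.
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")

    best_tpr = 0.0
    # Every distinct observed score is a candidate threshold.
    for threshold in set(member_scores) | set(nonmember_scores):
        false_pos = sum(s >= threshold for s in nonmember_scores)
        fpr = false_pos / len(nonmember_scores)
        if fpr <= max_fpr:
            true_pos = sum(s >= threshold for s in member_scores)
            best_tpr = max(best_tpr, true_pos / len(member_scores))
    return best_tpr


if __name__ == "__main__":
    # Hypothetical attack scores, not results from the paper.
    members = [0.9, 0.8, 0.7, 0.2]
    nonmembers = [0.6, 0.3, 0.1, 0.05]
    print(tpr_at_fpr(members, nonmembers, max_fpr=0.25))
```

If the attack score is an entropy where lower values suggest membership, negate the scores before calling the function so the "higher means member" assumption holds.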

Abstract

Multimodal large language models (MLLMs) demonstrate remarkable capabilities in handling complex multimodal tasks and are increasingly adopted in video understanding applications. However, their rapid advancement raises serious data privacy concerns, particularly given the potential inclusion of sensitive video content, such as personal recordings and surveillance footage, in their training datasets. Determining improperly used videos during training remains a critical and unresolved challenge. Despite considerable progress on membership inference attacks (MIAs) for text and image data in MLLMs, existing methods fail to generalize effectively to the video domain. These methods suffer from poor scalability as more frames are sampled and generally achieve negligible true positive rates at low false positive rates (TPR@Low FPR), mainly due to their failure to capture the inherent temporal…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning