Lost in Modality: Evaluating the Effectiveness of Text-Based Membership Inference Attacks on Large Multimodal Models

Ziyi Tong; Feifei Sun; Le Minh Nguyen

arXiv:2512.03121·cs.CR·May 22, 2026

TL;DR

This paper evaluates the effectiveness of text-based membership inference attacks on large multimodal models, revealing that visual inputs can mask membership signals in out-of-distribution scenarios.

Contribution

First comprehensive assessment of text-based MIAs on multimodal models, comparing vision-and-text and text-only conditions across multiple model families.

Findings

01. Logit-based MIAs perform similarly in in-distribution settings, with a slight advantage for multimodal inputs.

02. Visual inputs act as regularizers, reducing membership inference effectiveness in out-of-distribution scenarios.

Abstract

Large Multimodal Language Models (MLLMs) are emerging as one of the foundational tools in an expanding range of applications. Consequently, understanding training-data leakage in these systems is increasingly critical. Log-probability-based membership inference attacks (MIAs) have become a widely adopted approach for assessing data exposure in large language models (LLMs), yet their effect in MLLMs remains unclear. We present the first comprehensive evaluation of extending these text-based MIA methods to multimodal settings. Our experiments under vision-and-text (V+T) and text-only (T-only) conditions across the DeepSeek-VL and InternVL model families show that in in-distribution settings, logit-based MIAs perform comparably across configurations, with a slight V+T advantage. Conversely, in out-of-distribution settings, visual inputs act as regularizers, effectively masking membership…
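
To make the idea of a log-probability-based MIA concrete, the following is a minimal sketch and not the authors' implementation. It assumes per-token log-probabilities for a candidate text have already been obtained from a model, whether or not an image was supplied alongside the text. The function names, the Min-K%-style variant, and the toy numbers are illustrative assumptions; the abstract does not say which specific attacks were evaluated.

```python
from typing import Sequence


def mean_logprob_score(token_logprobs: Sequence[float]) -> float:
    """Average per-token log-probability; a higher value suggests 'member'."""
    return sum(token_logprobs) / len(token_logprobs)


def min_k_score(token_logprobs: Sequence[float], k: float = 0.2) -> float:
    """Average of the lowest k fraction of token log-probs (Min-K%-style score)."""
    n = max(1, int(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]
    return sum(lowest) / n


def auc(member_scores: Sequence[float], nonmember_scores: Sequence[float]) -> float:
    """Chance that a random member outscores a random non-member (ties count half)."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Hypothetical per-token log-probs, invented purely for illustration.
members = [[-0.1, -0.2, -0.3, -1.0], [-0.2, -0.1, -0.4, -0.5]]
nonmembers = [[-1.5, -2.0, -0.3, -2.5], [-0.9, -1.1, -3.0, -0.7]]

member_scores = [mean_logprob_score(x) for x in members]
nonmember_scores = [mean_logprob_score(x) for x in nonmembers]
print(auc(member_scores, nonmember_scores))
```

Under these assumptions, a V+T versus T-only comparison would amount to computing the same score twice per sample, once from log-probabilities conditioned on the image and once without it, and comparing the resulting AUCs.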

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Multimodal Machine Learning Applications