LUMIA: Linear probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states

Luis Ibanez-Lissen; Lorena Gonzalez-Manzano; Jose Maria de Fuentes; Nicolas Anciaux; Joaquin Garcia-Alfaro

arXiv:2411.19876·cs.CR·January 13, 2025

Open Access

TL;DR

LUMIA introduces a layer-by-layer linear probing method that analyzes the internal activations of large language models to detect membership inference attacks. In unimodal settings it gains an average of 15.71% AUC over prior techniques, and it is also evaluated on multimodal tasks.

Contribution

The paper presents LUMIA, a novel linear probing approach that leverages internal LLM states for more effective membership inference attack detection, outperforming previous methods.

Findings

01

LUMIA achieves an average 15.71% AUC gain over prior techniques.

02

In 65.33% of cases, LUMIA reaches AUC > 60%.

03

In multimodal models, visual inputs enhance MIA detection, with 85.90% of experiments reaching AUC > 60%.

Abstract

Large Language Models (LLMs) are increasingly used in a variety of applications, but concerns around membership inference have grown in parallel. Previous efforts focus on black-to-grey-box models, thus neglecting the potential benefit from internal LLM information. To address this, we propose the use of Linear Probes (LPs) as a method to detect Membership Inference Attacks (MIAs) by examining internal activations of LLMs. Our approach, dubbed LUMIA, applies LPs layer-by-layer to get fine-grained data on the model's inner workings. We test this method across several model architectures, sizes and datasets, including unimodal and multimodal tasks. In unimodal MIA, LUMIA achieves an average gain of 15.71% in Area Under the Curve (AUC) over previous techniques. Remarkably, LUMIA reaches AUC > 60% in 65.33% of cases, an increment of 46.80% against the state of the art. Furthermore, our…
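The layer-by-layer probing idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the activations below are synthetic Gaussian vectors standing in for per-layer hidden states, the probe is assumed to be a logistic regression trained by plain gradient descent, and every name (`train_probe`, `synthetic_activations`, `probe_all_layers`, the `shifts` parameter) is hypothetical. The only point is to show one probe per layer, each scored by AUC on held-out samples.

```python
import math
import random

def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (member, non-member) pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def train_probe(X, y, epochs=200, lr=0.1):
    """Fit a linear probe (assumed here: logistic regression) by batch gradient descent."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, t in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            err = 1.0 / (1.0 + math.exp(-z)) - t  # predicted probability minus label
            for j in range(dim):
                gw[j] += err * x[j]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def probe_scores(w, b, X):
    """Linear score per sample; higher means 'more likely a member'."""
    return [b + sum(wi * xi for wi, xi in zip(w, x)) for x in X]

def synthetic_activations(n, dim, shift, rng):
    """Hypothetical stand-in for one layer's activations.
    Members (label 1) have their mean shifted by `shift`; non-members do not."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(shift * label, 1.0) for _ in range(dim)])
        y.append(label)
    return X, y

def probe_all_layers(shifts, n_train=200, n_test=200, dim=8, seed=0):
    """Train one probe per layer and report its held-out AUC."""
    rng = random.Random(seed)
    results = []
    for layer, shift in enumerate(shifts):
        X_tr, y_tr = synthetic_activations(n_train, dim, shift, rng)
        X_te, y_te = synthetic_activations(n_test, dim, shift, rng)
        w, b = train_probe(X_tr, y_tr)
        results.append((layer, auc(probe_scores(w, b, X_te), y_te)))
    return results

# Three pretend layers with increasing member/non-member separation.
layer_aucs = probe_all_layers(shifts=[0.0, 0.3, 1.0])
best_layer, best_auc = max(layer_aucs, key=lambda r: r[1])
for layer, score in layer_aucs:
    print(f"layer {layer}: AUC = {score:.3f}")
```

In a real setting the synthetic vectors would be replaced by activations extracted from each layer of the model for known member and non-member samples; how those activations are pooled or selected is not specified in the text above and is left open here.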

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

MethodsFocus