Tab-MIA: A Benchmark Dataset for Membership Inference Attacks on Tabular Data in LLMs

Eyal German; Sagiv Antebi; Daniel Samira; Asaf Shabtai; Yuval Elovici

arXiv:2507.17259·cs.CR·July 24, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces Tab-MIA, a benchmark dataset for evaluating membership inference attacks on large language models trained on tabular data, revealing high vulnerability even after limited fine-tuning.

Contribution

It presents the first benchmark dataset and evaluation framework for MIAs on LLMs with tabular data, highlighting format-dependent memorization and privacy risks.

Findings

01. LLMs memorize tabular data to varying degrees depending on the encoding format.

02. Models fine-tuned for only a few epochs are highly vulnerable to MIAs.

03. High AUROC scores (~90%) indicate significant privacy risks.
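As a rough illustration of the AUROC metric cited above, the sketch below computes it directly from attack scores: it is the probability that a randomly chosen member record receives a higher score than a randomly chosen non-member, with ties counted as half. The function name `auroc` and the score values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def auroc(member_scores: Sequence[float], nonmember_scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of AUROC.

    Higher scores are assumed to mean "more likely a training member".
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0   # member ranked above non-member
            elif m == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(member_scores) * len(nonmember_scores))


# Hypothetical attack scores, for illustration only.
members = [0.9, 0.8, 0.7, 0.4]
nonmembers = [0.6, 0.3, 0.2, 0.1]
print(auroc(members, nonmembers))  # 15 of 16 pairs are ordered correctly: 0.9375
```

An AUROC of 0.5 corresponds to an attack that cannot separate members from non-members; values near 1.0 mean the two groups are almost perfectly separable.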

Abstract

Large language models (LLMs) are increasingly trained on tabular data, which, unlike unstructured text, often contains personally identifiable information (PII) in a highly structured and explicit format. As a result, privacy risks arise, since sensitive records can be inadvertently retained by the model and exposed through data extraction or membership inference attacks (MIAs). While existing MIA methods primarily target textual content, their efficacy and threat implications may differ when applied to structured data, due to its limited content, diverse data types, unique value distributions, and column-level semantics. In this paper, we present Tab-MIA, a benchmark dataset for evaluating MIAs on tabular data in LLMs and demonstrate how it can be used. Tab-MIA comprises five data collections, each represented in six different encoding formats. Using our Tab-MIA benchmark, we conduct…
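To make the idea of representing one table in several encoding formats concrete, here is a minimal sketch that serializes the same rows into four text encodings. The four formats shown (JSON, CSV, Markdown, key-value lines), the function names, and the sample rows are assumptions chosen for illustration; the abstract does not list the six formats the benchmark actually uses.

```python
import csv
import io
import json
from typing import List, Sequence


def to_json(header: Sequence[str], rows: List[Sequence]) -> str:
    # One JSON object per row, keyed by column name.
    return json.dumps([dict(zip(header, r)) for r in rows])


def to_csv(header: Sequence[str], rows: List[Sequence]) -> str:
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def to_markdown(header: Sequence[str], rows: List[Sequence]) -> str:
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(str(v) for v in r) + " |" for r in rows]
    return "\n".join(lines)


def to_key_value(header: Sequence[str], rows: List[Sequence]) -> str:
    # One line per row, written as "column: value" pairs.
    return "\n".join(
        ", ".join(f"{h}: {v}" for h, v in zip(header, r)) for r in rows
    )


# Hypothetical records, for illustration only.
header = ["age", "city"]
rows = [[34, "Haifa"], [51, "Lyon"]]
for encode in (to_json, to_csv, to_markdown, to_key_value):
    print(encode(header, rows), end="\n\n")
```

Each encoding carries the same values but produces a different token sequence, which is one reason memorization and attack success could plausibly differ by format.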

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 0 · Confidence 4

Strengths

- Runs three existing MIAs (LOSS attack, Min-K% attack, Min-K%++) on LLaMA-3.1 8B, LLaMA-3.2 3B, Gemma-3 4B, and Mistral 7B, each QLoRA fine-tuned on the Tab-MIA dataset.
- Highlights that tabular data may contain personally identifiable information (PII), commercially sensitive material, or domain-specific details that are not intended for broad dissemination.

Weaknesses

- Lack of dataset novelty/validity: Tab-MIA is a recombination of existing tabular datasets. The construction also puts the member vs. non-member boundary at risk, as there is no guarantee that the chosen LLMs have not already seen all of the data.
- No methodological novelty: No new membership inference attack tailored to tabular data is proposed.
- Results lack novelty: The paper largely reiterates established findings and offers nothing new about tabular data:
  - LLMs can memorize tabular data
  - partial transferabilit

Reviewer 02 · Rating 4 · Confidence 4

Strengths

* The work fills an important gap in auditing LLMs for privacy risks, beyond conventional text membership tests.
* Strong evaluation on a suite of LLMs under a fine-tuning paradigm, as well as evaluation of some public pretrained models.

Weaknesses

* I wonder about the realism of fine-tuning a model on context-free tables. In most natural language tasks, a table is accompanied by text that sets context, provides explanation, injects a user query, etc. From this perspective, I'm not sure how to interpret the significance of the attack results.
* Experiments on larger models would be valuable, even just targeting large pretrained models without fine-tuning.
* An experiment demonstrating a defense (e.g., training with DP-SGD

Reviewer 03 · Rating 2 · Confidence 4

Strengths

- The paper is easy to follow and presents a useful and timely contribution, introducing a benchmark for MIAs on tabular data in LLMs.
- Provides valuable empirical insights into memorization patterns and attack transferability, highlighting underexplored privacy risks in adapting tabular data to LLMs.
- The methodology and benchmark design have practical utility for future privacy research on structured data.

Weaknesses

- The bullet points in lines 257–269 repeat the content of Figure 1; consider moving these details to the appendix for brevity.
- Consider including a visual example contrasting long- vs. short-context table encodings to help readers intuitively understand the setup.
- The paper relies heavily on MIA metrics but provides limited explanation of them. Expanding the description of attack metrics in the main text would make the results more interpretable.
- Consider reordering the results section,

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Data Quality and Management