PerProb: Indirectly Evaluating Memorization in Large Language Models

Yihan Liao; Jacky Keung; Xiaoxue Ma; Jingyu Zhang; Yicheng Sun

arXiv:2512.14600·cs.CR·December 17, 2025

TL;DR

PerProb is a standardized, label-free framework for indirectly assessing memorization and privacy risks in large language models. It analyzes differences in perplexity and average log probability and applies across multiple models and settings.

Contribution

We propose PerProb, a novel, unified method for evaluating LLM memorization without relying on labels or internal model access, addressing limitations of prior membership inference attacks (MIAs).

Findings

01 PerProb effectively detects memorization across multiple datasets.

02 Mitigation strategies such as differential privacy reduce data leakage.

03 Memorization behavior varies significantly across LLMs.

Abstract

The rapid advancement of Large Language Models (LLMs) has been driven by extensive datasets that may contain sensitive information, raising serious privacy concerns. One notable threat is the Membership Inference Attack (MIA), where adversaries infer whether a specific sample was used in model training. However, the true impact of MIA on LLMs remains unclear due to inconsistent findings and the lack of standardized evaluation methods, further complicated by the undisclosed nature of many LLM training sets. To address these limitations, we propose PerProb, a unified, label-free framework for indirectly assessing LLM memorization vulnerabilities. PerProb evaluates changes in perplexity and average log probability between data generated by victim and adversary models, enabling an indirect estimation of training-induced memory. Compared with prior MIA methods that rely on member/non-member…
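
The two quantities the abstract names, perplexity and average log probability, can both be derived from per-token log-probabilities. The following is a minimal sketch of those standard definitions, not the authors' implementation. The function names, the `metric_gap` helper, and the plain subtraction are assumptions made for illustration. The abstract does not say how PerProb combines the two metrics or which model scores which text, so the caller supplies the log-probabilities.

```python
import math
from typing import Dict, Sequence


def avg_log_prob(token_logprobs: Sequence[float]) -> float:
    """Mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return sum(token_logprobs) / len(token_logprobs)


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity: exp of the negative mean log probability."""
    return math.exp(-avg_log_prob(token_logprobs))


def metric_gap(
    victim_logprobs: Sequence[float],
    adversary_logprobs: Sequence[float],
) -> Dict[str, float]:
    """Hypothetical helper (not from the paper): victim-minus-adversary
    difference in each metric for two scored token sequences."""
    return {
        "perplexity_gap": perplexity(victim_logprobs)
        - perplexity(adversary_logprobs),
        "avg_log_prob_gap": avg_log_prob(victim_logprobs)
        - avg_log_prob(adversary_logprobs),
    }
```

For example, a sequence whose tokens each have probability 0.5 has an average log probability of `math.log(0.5)` and a perplexity of about 2. Lower perplexity means the scoring model finds the text more predictable.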

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Hate Speech and Cyberbullying Detection