LeakyCLIP: Extracting Training Data from CLIP

Yunhao Chen; Shujie Wang; Xin Wang; Ran He; Xingjun Ma; Yu-Gang Jiang

arXiv:2508.00756·cs.CR·May 22, 2026

TL;DR

LeakyCLIP presents a new attack framework for reconstructing training images from CLIP embeddings, revealing significant privacy leakage risks in multimodal models.

Contribution

Introduces LeakyCLIP, a novel CLIP inversion method that improves image reconstruction quality and uncovers privacy risks in training data.

Findings

01 Improves SSIM by more than 258% compared with baseline methods.

02 Demonstrates successful inference of training data membership from low-fidelity reconstructions.

03 Reveals pervasive privacy leakage risks in CLIP models.
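
To make the SSIM figure in the first finding concrete, here is a minimal sketch of how a structural similarity score and a percentage gain over a baseline could be computed. It rests on two assumptions that this page does not confirm: that the reported gain is a relative improvement over the baseline score, and that a single-window (global) SSIM is an acceptable stand-in for the windowed SSIM normally used in evaluations. The function names `global_ssim` and `relative_improvement_pct` are hypothetical.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over the whole image (a simplification:
    standard SSIM averages this statistic over local windows)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizer for the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

def relative_improvement_pct(baseline, improved):
    """Percentage gain of `improved` over `baseline` (assumed reading of the finding)."""
    return 100.0 * (improved - baseline) / baseline

# Illustrative inputs only; these are not results from the paper.
rng = np.random.default_rng(0)
img = rng.random((8, 8))
same_score = global_ssim(img, img)            # identical images
inverted_score = global_ssim(img, 1.0 - img)  # intensity-inverted copy
```

Under this reading, a baseline score of 0.1 and a reconstruction score of 0.358 would correspond to a 258% improvement; those two numbers are made up for illustration.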

Abstract

Understanding the memorization and privacy leakage risks in Contrastive Language-Image Pretraining (CLIP) is critical for ensuring the security of multimodal models. Recent studies have demonstrated the feasibility of extracting sensitive training examples from diffusion models, with conditional diffusion models exhibiting a stronger tendency to memorize and leak information. In this work, we investigate data memorization and extraction risks in CLIP through the lens of CLIP inversion, a process that aims to reconstruct training images from text prompts. To this end, we introduce LeakyCLIP, a novel attack framework designed to achieve high-quality, semantically accurate image reconstruction from CLIP embeddings. We identify three key challenges in CLIP inversion: 1) non-robust features, 2) limited visual semantics in text embeddings, and 3) low reconstruction fidelity. To…
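
The general idea of inverting an embedding can be illustrated with a toy sketch: start from a random input and adjust it by gradient ascent until its embedding aligns with a target embedding. This is not the LeakyCLIP method and it does not use CLIP. The encoder here is a hypothetical random linear map with orthonormal rows, and the dimensions, step size, and the name `invert_embedding` are all assumptions chosen for illustration.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def invert_embedding(encoder, target, x0, lr=1.0, steps=500):
    """Gradient ascent on cos(encoder @ x, target) with respect to x."""
    t = target / np.linalg.norm(target)
    x = x0.copy()
    for _ in range(steps):
        z = encoder @ x
        norm_z = np.linalg.norm(z)
        s = (z @ t) / norm_z                    # current cosine similarity
        grad_z = (t - s * z / norm_z) / norm_z  # d cos / d z
        x += lr * (encoder.T @ grad_z)          # chain rule through the linear map
    return x

rng = np.random.default_rng(0)
d_img, d_emb = 64, 16  # toy sizes, far smaller than any real model

# Stand-in "encoder": a linear map with orthonormal rows (NOT a CLIP encoder).
q, _ = np.linalg.qr(rng.standard_normal((d_img, d_emb)))
encoder = q.T  # shape (d_emb, d_img)

x_true = rng.standard_normal(d_img)  # plays the role of a training image
target = encoder @ x_true            # plays the role of the target embedding
x0 = rng.standard_normal(d_img)      # random starting point

x_rec = invert_embedding(encoder, target, x0)
sim_before = cosine(encoder @ x0, target)
sim_after = cosine(encoder @ x_rec, target)
```

In this toy setting the recovered input matches the target embedding almost perfectly yet still differs from `x_true`, because the map from 64 to 16 dimensions is many-to-one. That gap between embedding agreement and input agreement is one simple way to see why an inversion can land far from the original input, although the toy says nothing about how CLIP itself behaves.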

Code & Models

Repository: dongdongunique/LeakyCLIP (GitHub)