DATABench: Evaluating Dataset Auditing in Deep Learning from an Adversarial Perspective

Shuo Shao; Yiming Li; Mengren Zheng; Zhiyang Hu; Yukun Chen; Boheng Li; Yu He; Junfeng Guo; Dacheng Tao; Zhan Qin

arXiv:2507.05622·cs.CR·December 16, 2025

Open Access

TL;DR

This paper evaluates the robustness of dataset auditing methods in deep learning against adversarial attacks, introduces a comprehensive benchmark, and highlights the need for more secure auditing techniques.

Contribution

The paper introduces DATABench, a benchmark of attack strategies for evaluating existing dataset auditing methods under adversarial conditions.

Findings

01. Existing auditing methods lack robustness against adversarial attacks.

02. DATABench includes 17 evasion and 5 forgery attacks for comprehensive evaluation.

03. Current methods are insufficiently reliable in adversarial scenarios.
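
To make the two attack families in the findings concrete, here is a minimal Python sketch of how an auditor's robustness could be scored. It is not DATABench's code or API. It assumes, as a reading of the terms, that an evasion attack succeeds when a model trained on the dataset is judged "not used", and that a forgery attack succeeds when a model not trained on it is judged "used". The names `AuditCase`, `audit_decision`, `evasion_success_rate`, `forgery_success_rate`, the 0.5 threshold, and the scores in `TOY_CASES` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AuditCase:
    """One attacked model as seen by a (hypothetical) dataset auditor."""
    score: float      # auditor's confidence that the dataset was used
    truly_used: bool  # ground truth: was the dataset actually used?


def audit_decision(score: float, threshold: float = 0.5) -> bool:
    """Return True when the auditor claims the dataset was used."""
    return score >= threshold


def evasion_success_rate(cases: Sequence[AuditCase], threshold: float = 0.5) -> float:
    """Fraction of truly-used cases the auditor misses (false negatives)."""
    used = [c for c in cases if c.truly_used]
    if not used:
        return 0.0
    missed = sum(not audit_decision(c.score, threshold) for c in used)
    return missed / len(used)


def forgery_success_rate(cases: Sequence[AuditCase], threshold: float = 0.5) -> float:
    """Fraction of not-used cases the auditor wrongly flags (false positives)."""
    unused = [c for c in cases if not c.truly_used]
    if not unused:
        return 0.0
    false_claims = sum(audit_decision(c.score, threshold) for c in unused)
    return false_claims / len(unused)


# Made-up scores for illustration only; these are not results from the paper.
TOY_CASES = [
    AuditCase(score=0.9, truly_used=True),
    AuditCase(score=0.2, truly_used=True),
    AuditCase(score=0.3, truly_used=True),
    AuditCase(score=0.7, truly_used=False),
    AuditCase(score=0.1, truly_used=False),
    AuditCase(score=0.4, truly_used=False),
]
```

Under these assumptions, a robust auditor would keep both rates close to zero across every attack. Calling `evasion_success_rate(TOY_CASES)` and `forgery_success_rate(TOY_CASES)` reports the two rates separately, which keeps a method that resists evasion but is easy to forge against from looking reliable overall.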

Abstract

The widespread application of Deep Learning across diverse domains hinges critically on the quality and composition of training datasets. However, the common lack of disclosure regarding their usage raises significant privacy and copyright concerns. Dataset auditing techniques, which aim to determine if a specific dataset was used to train a given suspicious model, provide promising solutions for addressing these transparency gaps. While prior work has developed various auditing methods, their resilience against dedicated adversarial attacks remains largely unexplored. To bridge this gap, this paper initiates a comprehensive study evaluating dataset auditing from an adversarial perspective. We start by introducing a novel taxonomy, classifying existing methods based on their reliance on internal features (IF) (inherent to the data) versus external features (EF) (artificially introduced…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)