DATABench: Evaluating Dataset Auditing in Deep Learning from an Adversarial Perspective
Shuo Shao, Yiming Li, Mengren Zheng, Zhiyang Hu, Yukun Chen, Boheng Li, Yu He, Junfeng Guo, Dacheng Tao, Zhan Qin

TL;DR
This paper evaluates the robustness of dataset auditing methods in deep learning against adversarial attacks, introduces a comprehensive benchmark, and highlights the need for more secure auditing techniques.
Contribution
It introduces DATABench, a benchmark that pairs a set of attack strategies with an evaluation of existing dataset auditing methods under adversarial conditions.
Findings
Existing auditing methods lack robustness against adversarial attacks.
DATABench includes 17 evasion and 5 forgery attacks for comprehensive evaluation.
Current methods are insufficiently reliable in adversarial scenarios.
Abstract
The widespread application of Deep Learning across diverse domains hinges critically on the quality and composition of training datasets. However, the common lack of disclosure regarding their usage raises significant privacy and copyright concerns. Dataset auditing techniques, which aim to determine whether a specific dataset was used to train a given suspicious model, offer a promising way to address these transparency gaps. While prior work has developed various auditing methods, their resilience against dedicated adversarial attacks remains largely unexplored. To bridge this gap, this paper initiates a comprehensive study evaluating dataset auditing from an adversarial perspective. We start by introducing a novel taxonomy, classifying existing methods based on their reliance on internal features (IF) (inherent to the data) versus external features (EF) (artificially introduced…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)
