Evading Data Provenance in Deep Neural Networks

Hongyu Zhu; Sichu Liang; Wenwen Wang; Zhuomeng Zhang; Fangqi Li; Shi-Lin Wang

arXiv:2508.01074 · cs.CV · August 5, 2025

Open Access

TL;DR

This paper presents a novel evasion framework that effectively bypasses Dataset Ownership Verification methods in deep neural networks by transferring task-relevant knowledge while removing identifiers, exposing vulnerabilities in current protections.

Contribution

It introduces a unified evasion approach using teacher-student models and large language models to improve evasion success against DOV, revealing critical weaknesses in existing methods.

Findings

1. Our method outperforms nine state-of-the-art evasion attacks.

2. It successfully removes copyright identifiers from models.

3. Experiments demonstrate significant evasion effectiveness across diverse datasets.

Abstract

Modern over-parameterized deep models are highly data-dependent, with large-scale general-purpose and domain-specific datasets serving as the bedrock for rapid advancements. However, many datasets are proprietary or contain sensitive information, making unrestricted model training problematic. In an open world where data theft cannot be fully prevented, Dataset Ownership Verification (DOV) has emerged as a promising method to protect copyright by detecting unauthorized model training and tracing illicit activities. Because DOV methods are diverse and highly stealthy, evading them is considered extremely challenging. However, this paper identifies that previous studies have relied on overly simplistic evasion attacks for evaluation, leading to a false sense of security. We introduce a unified evasion framework, in which a teacher model first learns from the copyright dataset and then transfers…
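
The teacher-to-student transfer mentioned above is often realized with knowledge distillation. The following is a minimal sketch of a generic temperature-softened distillation loss, offered only as an illustration: this summary does not confirm that the paper uses this exact objective, and the function names `softmax` and `distillation_loss` and the default temperature are assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by `temperature`."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max to avoid overflow in exp
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic distillation objective (an assumption for illustration),
    not necessarily the loss used in the paper.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's soft predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [5.0, 1.0, 0.5]
    student = [2.0, 2.0, 1.0]
    print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge. A higher temperature spreads probability mass over non-target classes, so the student receives more of the teacher's relative class preferences.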

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Explainable Artificial Intelligence (XAI)