Towards a Re-evaluation of Data Forging Attacks in Practice
Mohamed Suliman, Anisa Halimi, Swanand Kadhe, Nathalie Baracaldo, Douglas Leith

TL;DR
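This paper critically examines data forging attacks in machine learning. It shows that current attacks are easily detectable in practice because their forged mini-batches do not reproduce the original gradients closely enough, analyses theoretically whether two distinct mini-batches can produce the same gradient, and calls for a re-evaluation of the strength of existing attacks.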
Contribution
It provides a practical and theoretical analysis of data forging attacks, highlighting their detectability and the difficulty of generating identical gradients within domain constraints.
Findings
Current attack methods are easily detectable by a verifier because the gradients of forged mini-batches differ measurably from the originals.
In theory, infinitely many distinct mini-batches produce an identical gradient when the data is real-valued.
Finding domain-constrained mini-batches with identical gradients is non-trivial.
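The contrast between the last two findings can be illustrated with a toy case. The sketch below is not the paper's construction: it assumes a linear model with squared loss and a single training example, and the helper names (`example_gradient`, `rescaled_example`, `in_pixel_domain`) are hypothetical. Rescaling a real-valued example by any factor c other than 0 or 1 gives a distinct example with the same gradient, yet the rescaled inputs fall outside an assumed integer pixel domain of 0 to 255.

```python
import numpy as np

def example_gradient(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w for one example."""
    return (w @ x - y) * x

def rescaled_example(w, x, y, c):
    """Build a distinct example (c*x, y_new) whose gradient equals that of (x, y).

    y_new is chosen so the residual shrinks by 1/c while the input grows by c.
    """
    x_new = c * x
    y_new = c * (w @ x) - (w @ x - y) / c
    return x_new, y_new

def in_pixel_domain(x):
    """Assumed domain constraint: integer values in [0, 255]."""
    return bool(np.all(x == np.round(x)) and np.all((x >= 0) & (x <= 255)))

# Illustrative values only, not taken from the paper.
w = np.array([0.2, -0.1, 0.05, 0.3])
x = np.array([10.0, 255.0, 3.0, 128.0])
y = 3.0
g = example_gradient(w, x, y)

results = []
for c in (0.5, 2.0, -1.5):
    x2, y2 = rescaled_example(w, x, y, c)
    same_gradient = bool(np.allclose(example_gradient(w, x2, y2), g))
    results.append((c, same_gradient, in_pixel_domain(x2)))
    print(c, same_gradient, in_pixel_domain(x2))
```

Every real c other than 0 and 1 yields another solution in this toy setting, which is why the real-valued case has infinitely many. The domain check is what makes the constrained problem harder: none of the rescaled inputs here remain valid pixel vectors.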
Abstract
Data forging attacks provide counterfactual proof that a model was trained on a given dataset when, in fact, it was trained on another. These attacks work by forging (replacing) mini-batches with ones containing distinct training examples that produce nearly identical gradients. Data forging appears to break any potential avenues for data governance, as adversarial model owners may forge their training set from a non-compliant dataset to a compliant one. Given these serious implications for data auditing and compliance, we critically analyse data forging from both a practical and theoretical point of view, finding that a key practical limitation of current attack methods makes them easily detectable by a verifier: they cannot produce sufficiently identical gradients. Theoretically, we analyse the question of whether two distinct mini-batches can produce the same gradient.…
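The verifier-side check described in the abstract can be sketched as a gradient comparison. This is a minimal illustration under stated assumptions, not the paper's verification procedure: the linear model with mean squared error, the L2 distance, the tolerance `tol`, and all function names are hypothetical choices made for the example.

```python
import numpy as np

def minibatch_gradient(w, X, y):
    """Gradient of 0.5 * mean((Xw - y)^2) with respect to w."""
    residual = X @ w - y
    return X.T @ residual / len(y)

def gradient_gap(w, batch_a, batch_b):
    """L2 distance between the gradients of two mini-batches."""
    g_a = minibatch_gradient(w, *batch_a)
    g_b = minibatch_gradient(w, *batch_b)
    return float(np.linalg.norm(g_a - g_b))

def is_forgery_detected(w, original, forged, tol=1e-8):
    """Flag the forged batch if its gradient is not close enough to the original's.

    tol is an assumed threshold; a real verifier would have to account for
    floating-point and hardware nondeterminism when choosing it.
    """
    return gradient_gap(w, original, forged) > tol

rng = np.random.default_rng(0)
w = rng.normal(size=5)
X = rng.normal(size=(8, 5))
y = rng.normal(size=8)

original = (X, y)
# A "forged" batch with distinct examples whose gradient is only approximately equal.
forged = (X + 1e-3 * rng.normal(size=X.shape), y)

print(is_forgery_detected(w, original, forged))
print(is_forgery_detected(w, original, original))
```

A forged batch whose gradient is only nearly identical fails a sufficiently tight check, while a batch compared with itself passes. That gap between "nearly" and "sufficiently" identical is the practical limitation the abstract points to.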
Taxonomy
Topics: Digital and Cyber Forensics · Advanced Malware Detection Techniques · Network Security and Intrusion Detection
