Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency
Soumyadeep Pal, Yuguang Yao, Ren Wang, Bingquan Shen, Sijia Liu

TL;DR
This paper introduces a novel method for automatically identifying backdoor data within poisoned datasets using an optimized scaled prediction consistency approach, without needing clean data or manual thresholds.
Contribution
The paper develops a bi-level optimization framework built on scaled prediction consistency (SPC) for backdoor data detection. It removes earlier methods' reliance on clean data and manually set thresholds, and it covers both label-corrupted and clean-label attacks.
Findings
Achieves 4%-36% improvement in AUROC over baselines
Effective against both label-corrupted and clean-label backdoor attacks
Operates without access to clean data or manual thresholds
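To make the AUROC figure above concrete, here is a minimal sketch of how AUROC can be computed for a backdoor detector that assigns each training sample a suspicion score. The function name `auroc` and the toy scores are illustrative assumptions, not taken from the paper; the computation is the standard pairwise-ranking form of AUROC, with ties counted as one half.

```python
import numpy as np

def auroc(scores, is_backdoor):
    """Probability that a random backdoor sample outscores a random clean one.

    Hypothetical helper for illustration; ties count as half a win.
    """
    scores = np.asarray(scores, dtype=float)
    is_backdoor = np.asarray(is_backdoor, dtype=bool)
    pos = scores[is_backdoor]    # scores of true backdoor samples
    neg = scores[~is_backdoor]   # scores of clean samples
    # Compare every (backdoor, clean) pair of scores.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Toy data (made up): two backdoor samples, three clean samples.
toy_scores = [0.9, 0.8, 0.3, 0.8, 0.1]
toy_labels = [1, 1, 0, 0, 0]
toy_auroc = auroc(toy_scores, toy_labels)
```

Because AUROC depends only on how samples are ranked, it evaluates a detector without fixing any particular decision threshold.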
Abstract
Modern machine learning (ML) systems demand substantial training data, often resorting to external sources. Nevertheless, this practice renders them vulnerable to backdoor poisoning attacks. Prior backdoor defense strategies have primarily focused on the identification of backdoored models or poisoned data characteristics, typically operating under the assumption of access to clean data. In this work, we delve into a relatively underexplored challenge: the automatic identification of backdoor data within a poisoned dataset, all under realistic conditions, i.e., without the need for additional clean data or a manually defined threshold for backdoor detection. We draw inspiration from the scaled prediction consistency (SPC) technique, which exploits the prediction invariance of poisoned data to an input scaling factor. Based on this, we pose the backdoor data identification…
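The SPC idea mentioned in the abstract can be sketched as follows. This is an illustrative reading of "prediction invariance to an input scaling factor", not the paper's optimized method: for each sample, multiply the input by several factors, clip to the valid pixel range, and record how often the predicted label stays the same. The names `spc_scores` and `toy_predict`, the scale set (2, 3, 4, 5), the [0, 1] pixel range, and the toy classifier are all assumptions made for this example.

```python
import numpy as np

def spc_scores(predict_labels, images, scales=(2, 3, 4, 5)):
    """Fraction of scaling factors under which each predicted label is unchanged.

    predict_labels: callable mapping an (N, D) array to N integer labels.
    images: array with pixel values assumed to lie in [0, 1].
    """
    images = np.asarray(images, dtype=np.float64)
    base = predict_labels(images)
    agree = np.zeros(len(images), dtype=np.float64)
    for n in scales:
        scaled = np.clip(n * images, 0.0, 1.0)  # scale, then keep pixels valid
        agree += (predict_labels(scaled) == base)
    return agree / len(scales)

def toy_predict(x):
    """Hypothetical classifier: a saturated first pixel acts as a trigger for
    class 1; otherwise the mean of the remaining pixels picks class 0 or 2."""
    trigger = x[:, 0] >= 0.9
    bright = x[:, 1:].mean(axis=1) >= 0.5
    return np.where(trigger, 1, np.where(bright, 2, 0))

clean = [0.0, 0.2, 0.2, 0.2]      # no trigger, dim image
poisoned = [1.0, 0.2, 0.2, 0.2]   # trigger pixel already saturated
scores = spc_scores(toy_predict, np.array([clean, poisoned]))
```

In this toy setup the clean sample changes class once scaling brightens it past the threshold, so its score is low, while the triggered sample keeps its label under every scale and scores 1.0. Turning such scores into a detector normally requires choosing a threshold, which is the manual step the paper aims to avoid.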
Taxonomy
Topics: Digital and Cyber Forensics · Advanced Malware Detection Techniques · Network Security and Intrusion Detection
