Revisiting the Auxiliary Data in Backdoor Purification
Shaokui Wei, Shanchao Yang, Jiayin Liu, Hongyuan Zha

TL;DR
This paper evaluates the dependence of backdoor purification techniques on auxiliary dataset quality and introduces Guided Input Calibration (GIC), a learnable transformation method that improves purification effectiveness across various auxiliary data types.
Contribution
The study reveals the impact of auxiliary dataset quality on purification success and proposes GIC, a novel method that enhances purification by aligning auxiliary data with the training set.
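To make the idea of a learnable transformation that aligns auxiliary data with a target distribution concrete, here is a toy sketch. It is not the paper's GIC procedure, whose details this summary does not give. It assumes, hypothetically, that a target mean and standard deviation are known, and it fits a scalar affine map `a * x + b` by gradient descent. The name `fit_affine_calibration` and all data values are illustrative assumptions.

```python
import statistics


def fit_affine_calibration(aux, target_mean, target_std, lr=0.1, steps=2000):
    """Fit scale a and shift b so that a*x + b matches the target mean/std.

    Toy stand-in for a learnable input calibration; NOT the paper's GIC.
    """
    a, b = 1.0, 0.0
    m = statistics.fmean(aux)    # mean of the auxiliary inputs
    s = statistics.pstdev(aux)   # population std of the auxiliary inputs
    for _ in range(steps):
        # After the affine map: mean = a*m + b, std = |a|*s.
        mean_err = a * m + b - target_mean
        std_err = abs(a) * s - target_std
        sign = 1.0 if a >= 0 else -1.0
        # Gradients of loss = mean_err**2 + std_err**2.
        grad_a = 2.0 * mean_err * m + 2.0 * std_err * sign * s
        grad_b = 2.0 * mean_err
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Hypothetical auxiliary inputs whose statistics differ from the target.
aux = [0.2, 0.6, 1.0, 1.4, 1.8]
a, b = fit_affine_calibration(aux, target_mean=0.0, target_std=1.0)
calibrated = [a * x + b for x in aux]
```

After fitting, `calibrated` has approximately zero mean and unit standard deviation. A real calibration would operate on images and, per the summary above, would be guided toward the training distribution rather than toward hand-supplied statistics.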
Findings
High-quality in-distribution datasets are crucial for effective purification.
Out-of-distribution auxiliary datasets significantly reduce purification performance.
GIC improves purification results across diverse auxiliary datasets.
Abstract
Backdoor attacks occur when an attacker subtly manipulates machine learning models during the training phase, leading to unintended behaviors when specific triggers are present. To mitigate such emerging threats, a prevalent strategy is to cleanse the victim models using various backdoor purification techniques. Despite notable achievements, current state-of-the-art (SOTA) backdoor purification techniques usually rely on the availability of a small clean dataset, often referred to as an auxiliary dataset. However, acquiring an ideal auxiliary dataset poses significant challenges in real-world applications. This study begins by assessing the SOTA backdoor purification techniques across different types of real-world auxiliary datasets. Our findings indicate that the purification effectiveness fluctuates significantly depending on the type of auxiliary dataset used. Specifically, a high-quality…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Network Security and Intrusion Detection
