Filter, Obstruct and Dilute: Defending Against Backdoor Attacks on Semi-Supervised Learning

Xinrui Wang; Chuanxing Geng; Wenhai Wan; Shao-yuan Li; Songcan Chen

arXiv:2502.05755·cs.LG·February 11, 2025

TL;DR

This paper introduces Backdoor Invalidator (BI), a method to defend semi-supervised learning models against backdoor attacks by filtering, obstructing, and diluting malicious influences, significantly reducing attack success rates without harming clean data accuracy.

Contribution

The paper presents a novel defense method for SSL backdoor attacks that combines filtering, complementary learning, and trigger mix-up, with theoretical guarantees of effectiveness.

Findings

01

Reduces attack success rate from 84.7% to 1.8%

02

Maintains accuracy on clean data

03

Provides theoretical generalization guarantees

Abstract

Recent studies have verified that semi-supervised learning (SSL) is vulnerable to data poisoning backdoor attacks. Even a tiny fraction of contaminated training data is sufficient for adversaries to manipulate up to 90% of the test outputs in existing SSL methods. Given the emerging threat of backdoor attacks designed for SSL, this work aims to protect SSL against such risks, marking it as one of the few known efforts in this area. Specifically, we begin by identifying that the spurious correlations between the backdoor triggers and the target class implanted by adversaries are the primary cause of manipulated model predictions during the test phase. To disrupt these correlations, we utilize three key techniques: Gaussian Filter, complementary learning and trigger mix-up, which collectively filter, obstruct and dilute the influence of backdoor attacks in both data pre-processing and…
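
To make the three ingredients named in the abstract more concrete, here is a minimal NumPy sketch of generic versions of each: a Gaussian blur as pre-processing, a complementary-label loss, and a mix-up of two inputs. It is an illustration under assumptions, not the paper's implementation. The function names, the `sigma` and `lam` parameters, the loss form `-log(1 - p_k)`, and the plain convex-combination mix-up are all hypothetical choices made here; the abstract does not specify how BI configures or combines these steps.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float, radius: int) -> np.ndarray:
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def gaussian_filter_image(img: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Blur an H x W x C image with a separable Gaussian (assumed pre-processing).

    Blurring attenuates high-frequency patterns, which is the intuition behind
    using it to weaken small pixel-level triggers.
    """
    radius = max(1, int(3 * sigma))
    k = gaussian_kernel_1d(sigma, radius)
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)), mode="reflect")
    # Convolve along the height axis, then along the width axis.
    tmp = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, tmp)


def complementary_loss(probs: np.ndarray, comp_labels: np.ndarray, eps: float = 1e-12) -> float:
    """Mean of -log(1 - p_k), where k is a class the sample is assumed NOT to belong to.

    Minimizing this pushes probability mass away from class k, instead of
    pulling it toward a possibly poisoned pseudo-label.
    """
    p = probs[np.arange(len(comp_labels)), comp_labels]
    return float(np.mean(-np.log(1.0 - p + eps)))


def mixup(x_a: np.ndarray, x_b: np.ndarray, lam: float) -> np.ndarray:
    """Generic mix-up: convex combination of two inputs with weight lam on x_a."""
    return lam * x_a + (1.0 - lam) * x_b


if __name__ == "__main__":
    # A single bright pixel stands in for a tiny trigger pattern.
    img = np.zeros((16, 16, 3))
    img[8, 8, 0] = 1.0
    blurred = gaussian_filter_image(img, sigma=1.0)
    print("peak before:", img.max(), "peak after:", blurred.max())

    probs = np.array([[0.5, 0.5], [0.1, 0.9]])
    print("complementary loss:", complementary_loss(probs, np.array([0, 0])))
```

In this toy run the blur spreads the single bright pixel over its neighbours, so its peak intensity drops while the total intensity is preserved. How strongly a real trigger is attenuated depends on the trigger's frequency content and on `sigma`, neither of which is given in the text above.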

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications