Trapping Attacker in Dilemma: Examining Internal Correlations and External Influences of Trigger for Defending GNN Backdoors
Fan Yang, Binyan Xu, Di Tang, Kehuan Zhang

TL;DR
PRAETORIAN is a defense against GNN backdoors that detects triggers by analyzing internal correlations within candidate trigger subgraphs and the external influence that nodes exert on a victim. It reduces the average attack success rate to 0.55% at a cost of only 0.62% in clean accuracy.
Contribution
The paper introduces a defense that targets intrinsic properties of GNN backdoors rather than surface-level patterns, and that outperforms existing defenses under adaptive attacks.
Findings
Reduces the average attack success rate to 0.55%
Incurs only a 0.62% drop in clean accuracy
Effective against various adaptive attack strategies
Abstract
GNNs have become a standard tool for learning on relational data, yet they remain highly vulnerable to backdoor attacks. Prior defenses often depend on inspecting specific subgraph patterns or node features, and thus can be circumvented by adaptive attackers. We propose PRAETORIAN, a new defense that targets intrinsic requirements of effective GNN backdoors rather than surface-level cues. Our key observation is that flipping a victim node's prediction requires substantial influence on the victim: attackers tend to either inject many trigger nodes or rely on a small set of highly influential ones. Building on this observation, PRAETORIAN (i) analyzes internal correlations within potential trigger subgraphs to detect abnormally large injected structures, and (ii) quantifies external node influence to identify triggers with disproportionate impact. Across our evaluations, PRAETORIAN…
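To make the "external node influence" idea in the abstract concrete, here is a minimal Python sketch. It is not PRAETORIAN's algorithm: the abstract does not specify how influence is measured, so the sketch assumes a toy one-layer mean-aggregation model with scalar features and a leave-one-out influence score. The function names (`mean_aggregate`, `leave_one_out_influence`, `flag_outliers`), the `ratio` threshold, and the toy graph are all hypothetical choices made for illustration.

```python
from statistics import median


def mean_aggregate(features, neighbors, node, skip=None):
    """Mean of the scalar features of `node` and its neighbors.

    A stand-in for one GNN message-passing layer (assumption, not the
    paper's model). `skip` optionally drops one neighbor.
    """
    members = [node] + [u for u in neighbors[node] if u != skip]
    return sum(features[u] for u in members) / len(members)


def leave_one_out_influence(features, neighbors, victim):
    """Influence of each neighbor = how far the victim's aggregated value
    moves when that neighbor is removed (hypothetical influence measure)."""
    base = mean_aggregate(features, neighbors, victim)
    return {
        u: abs(base - mean_aggregate(features, neighbors, victim, skip=u))
        for u in neighbors[victim]
    }


def flag_outliers(influence, ratio=3.0):
    """Flag neighbors whose influence exceeds `ratio` times the median.

    `ratio` is an illustrative threshold, not a value from the paper.
    """
    typical = median(influence.values())
    return sorted(u for u, score in influence.items() if score > ratio * typical)


# Toy graph: victim node 0 has three ordinary neighbors and one neighbor
# (node 4) with an extreme feature value, standing in for a trigger node.
features = {0: 0.0, 1: 0.1, 2: -0.1, 3: 0.2, 4: 5.0}
neighbors = {0: [1, 2, 3, 4]}

influence = leave_one_out_influence(features, neighbors, victim=0)
suspects = flag_outliers(influence)
print(suspects)
```

In this toy setting only node 4 is flagged, because removing it shifts the victim's aggregated value far more than removing any ordinary neighbor. This covers only the "small set of highly influential nodes" side of the dilemma; the other side, many injected nodes with individually modest influence, is what the abstract says the internal-correlation analysis is meant to catch.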
