3S-Attack: Spatial, Spectral and Semantic Invisible Backdoor Attack Against DNN Models
Jianyao Yin, Luca Arnaboldi, Honglong Chen, Pascal Berrang, Mark Ryan

TL;DR
The paper introduces 3S-attack, a novel backdoor method exploiting spatial, spectral, and semantic features to create stealthy, hard-to-detect manipulations in DNN models, exposing vulnerabilities at the intersection of robustness and interpretability.
Contribution
It presents a new multi-domain backdoor attack leveraging semantic features as triggers, combining spectral embedding and spatial restrictions for enhanced stealthiness.
Findings
3S-attack is highly stealthy against existing defenses.
The attack maintains high accuracy on benign inputs.
It reveals vulnerabilities related to semantic interpretability and robustness.
Abstract
Backdoor attacks implant hidden behaviors into models by poisoning training data or modifying the model directly. These attacks aim to maintain high accuracy on benign inputs while causing misclassification when a specific trigger is present. While existing studies have explored stealthy triggers in the spatial and spectral domains, few incorporate the semantic domain. In this paper, we propose 3S-attack, a novel backdoor attack that is stealthy across the spatial, spectral, and semantic domains. The key idea is to exploit the semantic features of benign samples as triggers, using Gradient-weighted Class Activation Mapping (Grad-CAM) and a preliminary model for extraction. We then embed the trigger in the spectral domain and apply pixel-level restrictions in the spatial domain. This process minimizes the distance between poisoned and benign samples, making the attack harder to detect…
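As a rough illustration of the last two steps in the abstract (spectral embedding, then a pixel-level restriction), here is a minimal Python sketch. It is not the authors' implementation. It assumes the semantic trigger patch has already been extracted, so the Grad-CAM step is omitted, and it uses a 2D DCT as the spectral transform. The function `embed_trigger`, its parameters `threshold`, `strength` and `max_delta`, and the coefficient-selection rule are all hypothetical choices made for this example.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix C (C times its transpose is the identity)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def dct2(block):
    """2D DCT of a square block: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    """Inverse 2D DCT of a square block: C^T * Y * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def embed_trigger(benign, trigger, threshold=50.0, strength=0.5, max_delta=8.0):
    """Blend trigger frequencies into a benign block, then bound the change.

    ASSUMPTION: selecting coefficients whose magnitude difference exceeds
    `threshold` is an illustrative rule, not the paper's stated formula.
    """
    n = len(benign)
    b, t = dct2(benign), dct2(trigger)
    # Spectral step: move selected coefficients part of the way to the trigger.
    mixed = [[b[u][v] + strength * (t[u][v] - b[u][v])
              if abs(t[u][v] - b[u][v]) > threshold else b[u][v]
              for v in range(n)] for u in range(n)]
    raw = idct2(mixed)
    # Spatial step: cap each pixel's change at max_delta, keep values in [0, 255].
    poisoned = []
    for i in range(n):
        row = []
        for j in range(n):
            delta = max(-max_delta, min(max_delta, raw[i][j] - benign[i][j]))
            row.append(max(0.0, min(255.0, benign[i][j] + delta)))
        poisoned.append(row)
    return poisoned

# Usage on toy 8x8 grayscale blocks (values are made up for the example).
benign = [[100.0] * 8 for _ in range(8)]
trigger = [[200.0] * 8 for _ in range(8)]
poisoned = embed_trigger(benign, trigger)
```

The pure-Python DCT keeps the sketch dependency-free. On real images a library transform such as SciPy's `scipy.fft.dctn` would be the more practical choice.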
Peer Reviews
Decision · Submitted to ICLR 2026
**Comprehensive multi-domain stealth design**
* The attack unifies spatial, spectral, and semantic concealment, which no prior method achieves simultaneously.
* This cross-domain formulation exposes new security blind spots where standard single-domain defenses fail (Sec. 4.4).
* The modular design (Grad-CAM → DCT → pixel restriction) makes the idea easily reproducible and adaptable.

**Novel use of Grad-CAM for trigger extraction**
* Grad-CAM is used not for defense but to identify salient se
**Limited theoretical rigor in frequency-domain reasoning**
* The choice of the “Frequency Selection Threshold” (Sec. 3.3) is heuristic; no explicit formula or derivation links magnitude difference and model sensitivity.
* There is no analysis of how DCT component manipulation affects semantic embeddings or classification confidence (no equations in Sec. 3.3–3.5).

**Ambiguity in semantic transferability across models**
* The method assumes the saliency from a surrogate model approximates that
1. Novelty (first black-box semantic-stealth attack): Demonstrates the first attack that achieves semantic stealth without access to the victim model or training pipeline, filling a notable gap in threat modeling.
2. Rigorous multi-axis evaluation: Empirically validates stealth across semantic, spatial, and spectral defenses, showing the attack’s robustness against diverse defense paradigms.
1. Surrogate data dependence: The attack's reliance on a clean surrogate model and its sensitivity to distributional mismatch remain unexplored.
2. Incomplete reporting: Benign accuracy (BA) and post-defense results are missing for some datasets, weakening claims of minimal performance drop.
3. Labeling ambiguity: The labeling strategy for poisoned samples is unclear and inconsistent with the stated attack objective.
The paper proposes a unified backdoor attack that simultaneously achieves stealthiness across spatial, spectral, and semantic domains, which is an underexplored but meaningful direction. Experiments are performed on multiple datasets and models, showing the generality of the method. The paper provides clear algorithmic descriptions, ablation studies, and defense-resistance analyses, enhancing reproducibility and technical depth.
Although the paper claims semantic invisibility, the evaluation mainly relies on Grad-CAM visualization and AC/NC detection. More quantitative semantic similarity metrics (e.g., feature-space distance, neuron activation overlap) would strengthen the claim. Some compared methods are relatively dated. Including more recent backdoor attacks would make comparisons more convincing. The manuscript is lengthy, with excessive large figures and overlapping content between the main text and appendix, wh
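The two quantitative metrics this review names as examples, feature-space distance and neuron activation overlap, could be computed along the following lines. This is a hedged sketch only: the helper names `cosine_distance` and `topk_overlap`, the choice of cosine distance, and the top-k Jaccard definition of overlap are assumptions made for illustration and do not come from the paper or the review.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)

def topk_overlap(a, b, k):
    """Jaccard overlap between the k most activated neuron indices of a and b."""
    top_a = set(sorted(range(len(a)), key=lambda i: a[i], reverse=True)[:k])
    top_b = set(sorted(range(len(b)), key=lambda i: b[i], reverse=True)[:k])
    return len(top_a & top_b) / len(top_a | top_b)

# Usage with made-up activation vectors for a benign and a poisoned sample.
benign_act = [0.9, 0.1, 0.8, 0.2]
poisoned_act = [0.7, 0.6, 0.1, 0.0]
distance = cosine_distance(benign_act, poisoned_act)
overlap = topk_overlap(benign_act, poisoned_act, k=2)
```

A small distance and a high overlap between benign and poisoned samples would support a semantic-stealth claim. In practice the activation vectors would be read from an intermediate layer of the model under test.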
Taxonomy
Topics: Adversarial Robustness in Machine Learning · COVID-19 diagnosis using AI · Network Security and Intrusion Detection
