Architecture-Agnostic Feature Synergy for Universal Defense Against Heterogeneous Generative Threats
Bingxue Zhang, Yang Gao, Feida Zhu, Yanyan Shen, Yang Shi

TL;DR
This paper introduces ATFS, a universal defense framework that aligns high-level features across diverse generative models to effectively protect against heterogeneous threats, overcoming limitations of architecture-specific defenses.
Contribution
We propose a novel, architecture-agnostic feature alignment method that enhances robustness against diverse generative threats without complex modifications.
Findings
Achieves state-of-the-art protection in heterogeneous scenarios
Converges rapidly, reaching over 90% performance within 40 iterations
Maintains robustness under compression and scaling attacks
Abstract
Generative AI deployment poses unprecedented challenges to content safety and privacy. However, existing defense mechanisms are often tailored to specific architectures (e.g., Diffusion Models or GANs), creating fragile "defense silos" that fail against heterogeneous generative threats. This paper identifies a fundamental optimization barrier in naive pixel-space ensemble strategies: due to divergent objective functions, pixel-level gradients from heterogeneous generators become statistically orthogonal, causing destructive interference. To overcome this, we observe that despite disparate low-level mechanisms, high-level feature representations of generated content exhibit alignment across architectures. Based on this, we propose the Architecture-Agnostic Targeted Feature Synergy (ATFS) framework. By introducing a target guidance image, ATFS reformulates multi-model defense as a unified…
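To make the idea of targeted feature alignment concrete, the following is a minimal sketch, not the authors' ATFS implementation. The abstract is truncated before the actual objective is stated, so everything here is an assumption: the encoders are toy random `tanh(W x)` maps standing in for the feature extractors of different generators, the loss is a plain squared feature distance to a target guidance image, and the update is a signed-gradient step under an L-infinity budget. The names `make_encoder`, `feature_loss_and_grad`, and `protect`, and the values of `eps`, `step`, and `iters`, are hypothetical.

```python
import numpy as np


def make_encoder(rng, in_dim, feat_dim):
    """Hypothetical stand-in for one generator's feature extractor, f(x) = tanh(W x)."""
    return rng.normal(scale=1.0 / np.sqrt(in_dim), size=(feat_dim, in_dim))


def feature_loss_and_grad(W, x, target_feat):
    """Squared distance between f(x) and the target features, with its gradient w.r.t. x."""
    feat = np.tanh(W @ x)
    diff = feat - target_feat
    loss = float(diff @ diff)
    # Chain rule: d/dx ||tanh(Wx) - t||^2 = W^T (2 * diff * (1 - tanh(Wx)^2))
    grad = W.T @ (2.0 * diff * (1.0 - feat ** 2))
    return loss, grad


def protect(x, target, encoders, eps=0.05, step=0.01, iters=30):
    """Pull the features of x + delta toward the target's features for every encoder.

    Assumed objective (not taken from the paper): sum of per-encoder squared
    feature distances, minimised with signed-gradient steps inside an
    L-infinity ball of radius eps. Clipping to a valid pixel range is omitted.
    """
    target_feats = [np.tanh(W @ target) for W in encoders]
    delta = np.zeros_like(x)
    history = []  # total loss measured before each update
    for _ in range(iters):
        total_loss, total_grad = 0.0, np.zeros_like(x)
        for W, t in zip(encoders, target_feats):
            loss, grad = feature_loss_and_grad(W, x + delta, t)
            total_loss += loss
            total_grad += grad
        history.append(total_loss)
        delta = np.clip(delta - step * np.sign(total_grad), -eps, eps)
    return x + delta, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    encoders = [make_encoder(rng, 64, 16) for _ in range(3)]  # three toy "architectures"
    image = rng.uniform(size=64)    # flattened toy image
    target = rng.uniform(size=64)   # flattened toy target guidance image
    protected, history = protect(image, target, encoders)
    print("loss before first step:", history[0])
    print("loss before last step: ", history[-1])
```

Because every encoder is steered toward the features of the same target image, the per-encoder gradients share a common goal in this sketch, which is the intuition the abstract gives for avoiding the destructive interference seen in pixel-space ensembles. Real generators would replace the toy encoders, and their gradients would come from automatic differentiation rather than the closed form used here.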
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Security and Verification in Computing
