Pulling Back the Curtain: Unsupervised Adversarial Detection via Contrastive Auxiliary Networks
Eylon Mizrahi, Raz Lapid, Moshe Sipper

TL;DR
This paper introduces U-CAN, an unsupervised method that detects adversarial inputs by analyzing auxiliary feature representations inside a deep model. It requires no adversarial examples and targets safety-critical applications.
Contribution
The paper presents U-CAN, a novel unsupervised adversarial detection framework using contrastive auxiliary networks embedded in intermediate layers, outperforming existing methods across multiple datasets and architectures.
Findings
U-CAN achieves higher F1 scores than existing unsupervised detectors.
The method is effective across diverse datasets and model architectures.
U-CAN does not require adversarial examples for training or detection.
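For reference, the F1 score cited in the findings is the harmonic mean of precision and recall. The sketch below computes it for a binary detector, assuming the label 1 means "adversarial" and 0 means "benign"; the function name and label convention are illustrative and do not come from the paper.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels, treating 1 (assumed: adversarial) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # adversarial, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # adversarial, missed
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

A detector that flags every input reaches perfect recall but poor precision, so F1 penalizes it; this is why the metric suits detection tasks better than recall alone.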
Abstract
Deep learning models are widely employed in safety-critical applications yet remain susceptible to adversarial attacks -- imperceptible perturbations that can significantly degrade model performance. Conventional defense mechanisms predominantly focus on either enhancing model robustness or detecting adversarial inputs independently. In this work, we propose an Unsupervised adversarial detection via Contrastive Auxiliary Networks (U-CAN) to uncover adversarial behavior within auxiliary feature representations, without the need for adversarial examples. U-CAN is embedded within selected intermediate layers of the target model. These auxiliary networks, comprising projection layers and ArcFace-based linear layers, refine feature representations to more effectively distinguish between benign and adversarial inputs. Comprehensive experiments across multiple datasets (CIFAR-10, Mammals, and…
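The abstract mentions ArcFace-based linear layers without giving details. As a rough illustration of the general ArcFace idea (not the authors' implementation), the sketch below computes additive-angular-margin logits for a single feature vector in plain Python. The function names, the `scale` and `margin` defaults, and the assumption of non-zero input vectors are all illustrative choices, and the sketch omits the handling production implementations add for angles where `theta + margin` exceeds pi.

```python
import math

def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def arcface_logits(feature, class_weights, target, scale=30.0, margin=0.5):
    """Cosine logits with an angular margin added to the target class only.

    feature:       projected feature vector (list of floats, assumed non-zero)
    class_weights: one weight vector per class (assumed non-zero)
    target:        index of the ground-truth class
    scale, margin: hypothetical defaults, not values from the paper
    """
    f = l2_normalize(feature)
    logits = []
    for idx, w in enumerate(class_weights):
        # Cosine of the angle between the feature and the class weight.
        cos_theta = sum(a * b for a, b in zip(f, l2_normalize(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if idx == target:
            # Penalize the target class by widening its angle by `margin`.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits
```

With `feature = [1.0, 0.0]`, `class_weights = [[1.0, 0.0], [0.0, 1.0]]`, and `target = 0`, the target logit drops from 30.0 to about 26.33, while the other logit stays at 0. Training against these penalized logits pushes features of the same class into tighter angular clusters, which is the usual motivation for margin-based heads.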
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection
Methods: VGG-16 · Focus
