Pulling Back the Curtain: Unsupervised Adversarial Detection via Contrastive Auxiliary Networks

Eylon Mizrahi; Raz Lapid; Moshe Sipper

arXiv:2502.09110 · cs.CV · October 28, 2025

TL;DR

This paper introduces U-CAN, an unsupervised method that detects adversarial inputs by analyzing auxiliary feature representations inside a deep model. It needs no adversarial examples, which makes it a candidate for safety-critical applications.

Contribution

The paper presents U-CAN, an unsupervised adversarial detection framework that embeds contrastive auxiliary networks in selected intermediate layers of the target model. It reports higher F1 scores than existing unsupervised detectors across multiple datasets and architectures.

Findings

01. U-CAN achieves higher F1 scores than existing unsupervised detectors.

02. The method is effective across diverse datasets and model architectures.

03. U-CAN does not require adversarial examples for training or detection.

Abstract

Deep learning models are widely employed in safety-critical applications yet remain susceptible to adversarial attacks -- imperceptible perturbations that can significantly degrade model performance. Conventional defense mechanisms predominantly focus on either enhancing model robustness or detecting adversarial inputs independently. In this work, we propose an Unsupervised adversarial detection via Contrastive Auxiliary Networks (U-CAN) to uncover adversarial behavior within auxiliary feature representations, without the need for adversarial examples. U-CAN is embedded within selected intermediate layers of the target model. These auxiliary networks, comprising projection layers and ArcFace-based linear layers, refine feature representations to more effectively distinguish between benign and adversarial inputs. Comprehensive experiments across multiple datasets (CIFAR-10, Mammals, and…
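The abstract describes auxiliary networks built from projection layers and ArcFace-based linear layers. The snippet below is a minimal sketch of that idea using the standard ArcFace additive angular margin, not the authors' implementation. The function names (`project`, `arcface_logits`), the single linear projection, and the `margin` and `scale` values are all illustrative assumptions; none of them come from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def project(features, weights):
    """Hypothetical projection layer: a bias-free linear map.

    `weights` is a list of rows, one row per output dimension.
    """
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def arcface_logits(embedding, class_weights, target=None, margin=0.5, scale=30.0):
    """ArcFace-style logits from cosine similarity to per-class weight vectors.

    With `target` set (training), the angle to the target class is increased
    by `margin` before taking the cosine, which pushes classes apart on the
    unit hypersphere. With `target=None` (inference), the logits are plain
    scaled cosines. `margin` and `scale` are assumed values.
    Note: this sketch omits the usual safeguard for theta + margin > pi.
    """
    e = l2_normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(e, l2_normalize(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
        if target is not None and j == target:
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits


# Usage: project a 3-d intermediate feature to 2-d, then score two classes.
features = [3.0, 4.0, 5.0]
proj_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
class_weights = [[1.0, 0.0], [0.0, 1.0]]
embedding = project(features, proj_weights)
train_logits = arcface_logits(embedding, class_weights, target=0)
infer_logits = arcface_logits(embedding, class_weights)
```

In this sketch, the margin lowers only the target-class logit during training, so the network has to pull same-class embeddings closer together to compensate. How U-CAN turns the refined features into a detection score is not covered by the text above, so the sketch stops at the logits.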

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection

Methods: VGG-16 · Focus