FeatureLens: A Highly Generalizable and Interpretable Framework for Detecting Adversarial Examples Based on Image Features

Zhigang Yang; Yuan Liu; Jiawei Zhang; Puning Zhang; Xinqiang Ma

arXiv:2512.03625 · cs.CV · December 4, 2025

Open Access

TL;DR

FeatureLens is a lightweight, interpretable framework that effectively detects adversarial examples in images by analyzing features, achieving high accuracy and strong generalization across various attack types.

Contribution

It introduces a simple, generalizable, and interpretable approach using image features and shallow classifiers for adversarial example detection.

Findings

01

Detection accuracy up to 99.75% in closed-set scenarios

02

Generalization accuracy of 86.17% to 99.6% across FGSM, PGD, CW, and DAmageNet attacks

03

Low model complexity with 1,000 to 30,000 parameters

Abstract

Despite the remarkable performance of deep neural networks (DNNs) in image classification, their vulnerability to adversarial attacks remains a critical challenge. Most existing detection methods rely on complex architectures that are hard to interpret and generalize poorly. To address this, we propose FeatureLens, a lightweight framework that acts as a lens to scrutinize anomalies in image features. Comprising an Image Feature Extractor (IFE) and shallow classifiers (e.g., SVM, MLP, or XGBoost) with model sizes ranging from 1,000 to 30,000 parameters, FeatureLens achieves detection accuracy of 97.8% to 99.75% in closed-set evaluation and 86.17% to 99.6% in generalization evaluation across FGSM, PGD, CW, and DAmageNet attacks, using only 51-dimensional features. By combining strong detection performance with excellent generalization,…
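To give a feel for what a budget of 1,000 to 30,000 parameters means for a 51-dimensional input, the sketch below counts the weights and biases of a fully connected MLP. The hidden-layer layouts and the single-output head are assumptions chosen for illustration only; the abstract does not state which architectures the authors used.

```python
def mlp_param_count(input_dim, hidden_dims, output_dim=1):
    """Count weights and biases of a fully connected MLP."""
    dims = [input_dim, *hidden_dims, output_dim]
    # Each layer contributes fan_in * fan_out weights plus fan_out biases.
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(dims, dims[1:]))


FEATURE_DIM = 51  # feature dimensionality stated in the abstract

# Hypothetical hidden-layer layouts (not taken from the paper).
for hidden in ([16], [32], [64, 32], [128, 64], [256, 64]):
    n = mlp_param_count(FEATURE_DIM, hidden)
    in_range = 1_000 <= n <= 30_000
    print(f"hidden={hidden}: {n} parameters, within 1,000-30,000: {in_range}")
```

Under these assumptions, a single hidden layer of 32 units already lands inside the stated range, and a two-layer 256/64 network sits near its upper end, which is consistent with the classifiers being described as shallow.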

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Digital Media Forensic Detection