Fast and Lightweight Backdoor Detection via Head Random Probing
Yinbo Yu, Xueyu Yin, Jing Fang, Chunwei Tian, Qi Zhu, Jiajia Liu, Daoqiang Zhang

TL;DR
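HTell is a fast, data-free backdoor detector for neural networks. It feeds random latent probes directly into the prediction head and flags abnormal response concentration on a single class, cutting detection latency by more than 30,000 times relative to gradient-based methods.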
Contribution
The paper introduces HTell, a novel lightweight backdoor detector that does not require data or gradients, enabling efficient large-scale model auditing.
Findings
Achieves a 99.03% true positive rate at a 2.11% false positive rate.
Reduces detection latency by a factor of more than 30,000 compared with gradient-based methods.
Effective across diverse datasets, architectures, and attack types.
Abstract
Deep neural networks (DNNs) remain critically vulnerable to backdoor attacks. Existing post-training detectors often require clean or surrogate data, gradients, or iterative trigger reconstruction, leading to high computational costs and limited robustness under practical model-auditing scenarios. In this paper, we propose HTell, a fast and lightweight data-free backdoor detector based on head random probing. Instead of reconstructing diverse trigger patterns, HTell inspects their unified manifestation in the prediction head: backdoored models tend to exhibit abnormal response concentration on the target class under random latent probes. HTell generates architecture-aware random latent probes, feeds them directly into the model head, and detects backdoors by analyzing class-wise response statistics, without accessing real or surrogate data, model gradients, or parameter optimization. We…
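The probing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a plain linear prediction head, standard Gaussian probes, argmax frequency as the class-wise response statistic, and a median/MAD robust z-score as the anomaly measure; none of these choices is confirmed by the text above, and the "architecture-aware" probe generation is not modeled. The names probe_head, concentration_score, and make_linear_head are hypothetical, and the "backdoored" head is simulated by adding a large bias to one class.

```python
import numpy as np


def make_linear_head(weight, bias):
    """Return a toy prediction head mapping latents (n, d) to logits (n, classes)."""
    return lambda z: z @ weight.T + bias


def probe_head(head_fn, latent_dim, num_probes=4096, seed=0):
    """Feed random latent probes into the head; return per-class argmax frequency."""
    rng = np.random.default_rng(seed)
    # Assumption: standard Gaussian probes; the paper's probe design may differ.
    probes = rng.standard_normal((num_probes, latent_dim))
    logits = head_fn(probes)
    preds = logits.argmax(axis=1)
    counts = np.bincount(preds, minlength=logits.shape[1])
    return counts / num_probes


def concentration_score(freqs, eps=1e-6):
    """Robust z-score (median/MAD) of the most frequently predicted class.

    Assumption: a stand-in statistic, not the one used by HTell.
    """
    med = np.median(freqs)
    mad = np.median(np.abs(freqs - med))
    top = int(freqs.argmax())
    return top, float((freqs[top] - med) / (1.4826 * mad + eps))


# Toy comparison: a random "clean" head vs. the same head with one class
# artificially favored, standing in for a backdoor target class.
latent_dim, num_classes, target = 64, 10, 3
rng = np.random.default_rng(42)
weight = rng.standard_normal((num_classes, latent_dim))
clean_bias = np.zeros(num_classes)
bd_bias = clean_bias.copy()
bd_bias[target] += 40.0  # simulated target-class dominance

clean_freqs = probe_head(make_linear_head(weight, clean_bias), latent_dim)
bd_freqs = probe_head(make_linear_head(weight, bd_bias), latent_dim)

top_clean, score_clean = concentration_score(clean_freqs)
top_bd, score_bd = concentration_score(bd_freqs)
print("clean head:      top class", top_clean, "score", round(score_clean, 2))
print("simulated bd head: top class", top_bd, "score", round(score_bd, 2))
```

In this toy setting the probes need only forward passes through the head, with no data, gradients, or optimization, which is the property the abstract attributes to the detector. A real audit would need a decision threshold on the score, which this sketch does not attempt to set.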
