ProP: Efficient Backdoor Detection via Propagation Perturbation for Overparametrized Models
Tao Ren, Qiongxiu Li

TL;DR
ProP is a scalable, assumption-light backdoor detection method for overparameterized models that uses output distribution analysis to identify malicious models efficiently and accurately.
Contribution
ProP introduces a novel propagation perturbation technique and benign score metric for effective backdoor detection without prior trigger knowledge.
Findings
High detection accuracy across multiple attack types
Superior computational efficiency compared to existing methods
Effective in real-world scenarios with minimal assumptions
Abstract
Backdoor attacks pose significant challenges to the security of machine learning models, particularly for overparameterized models like deep neural networks. In this paper, we propose ProP (Propagation Perturbation), a novel and scalable backdoor detection method that leverages statistical output distributions to identify backdoored models and their target classes without relying on exhaustive optimization strategies. ProP introduces a new metric, the benign score, to quantify output distributions and effectively distinguish between benign and backdoored models. Unlike existing approaches, ProP operates with minimal assumptions, requiring no prior knowledge of triggers or malicious samples, making it highly applicable to real-world scenarios. Extensive experimental validation across multiple popular backdoor attacks demonstrates that ProP achieves high detection accuracy and…
Taxonomy
Topics: Real-time simulation and control systems · Anomaly Detection Techniques and Applications · Embedded Systems Design Techniques
