Toward Patch Robustness Certification and Detection for Deep Learning Systems Beyond Consistent Samples
Qilin Zhou, Zhengyuan Wei, Haipeng Wang, Zhuo Wang, W.K. Chan

TL;DR
This paper introduces HiCert, a masking-based certified detection technique for patch robustness in deep learning systems. It certifies both inconsistent and consistent samples, achieves higher accuracy on samples without warnings, and lowers the false silent ratio.
Contribution
HiCert is the first comprehensive method to certify patch robustness for both inconsistent and consistent samples in deep learning systems.
Findings
Certifies significantly more benign samples including inconsistent ones
Achieves higher accuracy on samples without warnings
Reduces false silent ratio substantially
Abstract
Patch robustness certification is an emerging kind of provable defense technique against adversarial patch attacks for deep learning systems. Certified detection ensures the detection of all patched harmful versions of certified samples, which mitigates the failures of empirical defense techniques that could (easily) be compromised. However, existing certified detection methods are ineffective in certifying samples that are misclassified or whose mutants are inconsistently predicted to different labels. This paper proposes HiCert, a novel masking-based certified detection technique. By focusing on the problem of mutants predicted with a label different from the true label with our formal analysis, HiCert formulates a novel formal relation between harmful samples generated by identified loopholes and their benign counterparts. By checking the bound of the maximum confidence among these…
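To make the notions of "mutants" and "consistent samples" concrete, here is a minimal sketch of the generic masking-based consistency check that the abstract attributes to existing certified detection methods. It is not HiCert's algorithm, whose confidence-bound analysis is not reproduced here. All function names, the toy classifier, and the choice of mask size `patch_size + stride - 1` (so that every possible patch location is fully covered by at least one mask) are illustrative assumptions.

```python
from itertools import product


def mask_origins(image_size, mask_size, stride):
    """Offsets along one axis; the last offset is pinned to the far edge."""
    last = image_size - mask_size
    origins = list(range(0, last + 1, stride))
    if origins[-1] != last:
        origins.append(last)
    return origins


def apply_mask(image, top, left, mask_size, fill=0.0):
    """Return a mutant: a copy of `image` with one square region blanked."""
    mutant = [row[:] for row in image]
    for r in range(top, top + mask_size):
        for c in range(left, left + mask_size):
            mutant[r][c] = fill
    return mutant


def mutant_labels(model, image, mask_size, stride):
    """Predict a label for every masked mutant of a square image."""
    origins = mask_origins(len(image), mask_size, stride)
    return [
        model(apply_mask(image, top, left, mask_size))
        for top, left in product(origins, origins)
    ]


def certify_consistent(model, image, true_label, patch_size, stride):
    """Certify only if every mutant is predicted as the true label.

    If this holds, any patched version has at least one mutant whose mask
    hides the patch and is still predicted as `true_label`, so a harmful
    prediction on the patched input would disagree with that mutant and
    could be flagged with a warning.
    """
    mask_size = patch_size + stride - 1
    labels = mutant_labels(model, image, mask_size, stride)
    return all(label == true_label for label in labels)


def toy_model(image):
    """Hypothetical stand-in classifier: label 1 if mean intensity > 0.5."""
    total = sum(sum(row) for row in image)
    return 1 if total / (len(image) * len(image[0])) > 0.5 else 0


bright = [[0.9] * 8 for _ in range(8)]       # robust to any single mask
borderline = [[0.55] * 8 for _ in range(8)]  # masking flips the prediction

print(certify_consistent(toy_model, bright, 1, patch_size=2, stride=2))
print(certify_consistent(toy_model, borderline, 1, patch_size=2, stride=2))
```

In this toy setting the second image is correctly classified, yet its mutants are predicted with a different label, so a consistency-only check cannot certify it. That is the kind of sample the abstract says existing methods handle poorly.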
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Security and Verification in Computing · Physical Unclonable Functions (PUFs) and Hardware Security
