Certified but Fooled! Breaking Certified Defences with Ghost Certificates

Quoc Viet Vo; Tashreque M. Haq; Paul Montague; Tamas Abraham; Ehsan Abbasnejad; Damith C. Ranasinghe

arXiv:2511.14003·cs.LG·November 19, 2025

TL;DR

This paper shows how adversaries can exploit probabilistic certification frameworks to craft imperceptible perturbations that spoof robustness guarantees, bypassing state-of-the-art certified defenses such as DensePure.

Contribution

It demonstrates that small, region-focused adversarial examples can deceive certified defenses into issuing certificates for an incorrect class, exposing a vulnerability in current robustness certification methods.

Findings

01. Effective bypass of DensePure certification using ghost certificates

02. Imperceptible perturbations can generate false robustness guarantees

03. Extensive evaluations on ImageNet validate the attack method

Abstract

Certified defenses promise provable robustness guarantees. We study the malicious exploitation of probabilistic certification frameworks to better understand the limits of guarantee provisions. Here, the objective is not only to mislead a classifier but also to manipulate the certification process into generating a robustness guarantee for an adversarial input, i.e., certificate spoofing. A recent ICLR study demonstrated that crafting large perturbations can shift inputs far into regions capable of generating a certificate for an incorrect class. Our study investigates whether the perturbations needed to cause a misclassification, and yet coax a certified model into issuing a deceptive, large robustness radius for a target class, can still be made small and imperceptible. We explore the idea of region-focused adversarial examples to craft imperceptible perturbations, spoof certificates and achieve…
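The abstract does not spell out how a probabilistic certificate is computed. As a rough illustration only, the sketch below assumes a randomized-smoothing style procedure (the family of methods that DensePure builds on): classify many Gaussian-noised copies of the input, lower-bound the top-class probability, and convert that bound into an L2 radius via sigma * Phi^{-1}(p_lower). The names `certify` and `classify` are hypothetical, and the Hoeffding bound is a simplification swapped in to keep the example dependency-free; it is not the paper's method or DensePure's implementation.

```python
import math
import random
from statistics import NormalDist


def certify(classify, x, sigma, n=1000, alpha=0.001, seed=0):
    """Sketch of a randomized-smoothing certificate (assumed setup, not the paper's code).

    classify: hypothetical base classifier mapping a list of floats to a label.
    Returns (label, radius), or (None, 0.0) when the procedure abstains.
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(n):
        # Perturb every coordinate with isotropic Gaussian noise.
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        label = classify(noisy)
        counts[label] = counts.get(label, 0) + 1

    top, k = max(counts.items(), key=lambda kv: kv[1])

    # One-sided Hoeffding lower bound on the top-class probability
    # (a simplification; it holds with probability at least 1 - alpha).
    p_lower = k / n - math.sqrt(math.log(1.0 / alpha) / (2 * n))
    if p_lower <= 0.5:
        return None, 0.0  # abstain: the top class is not confidently a majority

    # Certified L2 radius for the smoothed classifier.
    radius = sigma * NormalDist().inv_cdf(p_lower)
    return top, radius


if __name__ == "__main__":
    # Toy base classifier: label 1 when the first coordinate is positive.
    toy = lambda z: int(z[0] > 0)
    print(certify(toy, [2.0, 0.0], sigma=0.5))
```

Note that nothing in this procedure checks whether the certified label is correct. The certificate only states that the smoothed prediction is stable around the given input, which is why an input moved into a region dominated by a wrong class can still receive a large radius for that class.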

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning