Post-train Black-box Defense via Bayesian Boundary Correction

He Wang; Yunfeng Diao

arXiv:2306.16979 · cs.CV · June 12, 2024 · 1 citation

Open Access

TL;DR

This paper introduces Bayesian Boundary Correction (BBC), a post-train black-box defense method that enhances the robustness of pre-trained classifiers against adversarial attacks without re-training.

Contribution

The paper presents a novel Bayesian framework for black-box adversarial defense that requires neither re-training nor knowledge of the model specifics, and that is applicable to various data types.

Findings

01. BBC improves robustness against adversarial attacks.

02. BBC maintains high accuracy on clean data.

03. The framework is adaptable to different data modalities.

Abstract

Classifiers based on deep neural networks are susceptible to adversarial attacks, and this widespread vulnerability has motivated research into defending them from potential threats. Given a vulnerable classifier, existing defense methods are mostly white-box and often require re-training the victim under modified loss functions or training regimes. Since the model, data, and training specifics of the victim are usually unavailable to the user, re-training is unappealing, if not impossible, for reasons such as limited computational resources. To this end, we propose a new post-train black-box defense framework. It can turn any pre-trained classifier into a resilient one with little knowledge of the model specifics. This is achieved by new joint Bayesian treatments of the clean data, the adversarial examples, and the classifier, which maximize their joint probability. It is further equipped…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Cardiac Arrest and Resuscitation