BDFA: A Blind Data Adversarial Bit-flip Attack on Deep Neural Networks
Behnam Ghavami, Mani Sadati, Mohammad Shahidzadeh, Zhenman Fang, Lesley Shannon

TL;DR
This paper introduces BDFA, a novel adversarial attack method that flips bits in neural network weights without needing access to training or test data, significantly degrading model accuracy.
Contribution
BDFA is the first technique to perform data-independent bit-flip attacks by optimizing synthetic data to match the network's batch normalization statistics and the targeted label.
Findings
Decreases ResNet50 accuracy from 75.96% to 13.94%.
Achieves effective attacks with only 4 bit flips.
Operates without access to original training or test data.
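To give a sense of why a handful of flips can matter, here is a minimal sketch of what a single bit flip does to one stored weight. It assumes weights are held as 8-bit two's-complement integers, which is an assumption made for illustration and is not stated in the abstract. The helper name `flip_bit` is hypothetical and not taken from the paper.

```python
def flip_bit(weight: int, bit: int, n_bits: int = 8) -> int:
    """Flip one bit of a signed two's-complement integer weight.

    Illustrative only: the 8-bit signed storage format is an assumption,
    not a detail given in the abstract.
    """
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    if not lo <= weight <= hi:
        raise ValueError(f"weight must be in [{lo}, {hi}]")
    if not 0 <= bit < n_bits:
        raise ValueError(f"bit must be in [0, {n_bits - 1}]")

    mask = (1 << n_bits) - 1
    raw = (weight & mask) ^ (1 << bit)  # unsigned view with one bit toggled
    # Reinterpret the unsigned pattern as a signed value.
    return raw - (1 << n_bits) if raw > hi else raw


print(flip_bit(3, 0))  # least significant bit: 3 -> 2
print(flip_bit(3, 7))  # most significant (sign) bit: 3 -> -125
```

Under this assumed format, flipping the most significant bit moves a small positive weight to a large negative one, while flipping the least significant bit changes it by only one step. How BDFA chooses which bits to flip is not described in the abstract and is not modeled here.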
Abstract
Adversarial bit-flip attacks (BFA) on neural network weights can cause catastrophic accuracy degradation by flipping a very small number of bits. A major drawback of prior bit-flip attack techniques is their reliance on test data, which is frequently unavailable for applications that involve sensitive or proprietary data. In this paper, we propose Blind Data Adversarial Bit-flip Attack (BDFA), a novel technique that enables BFA without any access to the training or testing data. This is achieved by optimizing a synthetic dataset, which is engineered to match the batch normalization statistics across different layers of the network and the targeted label. Experimental results show that BDFA can decrease the accuracy of ResNet50 significantly, from 75.96% to 13.94%, with only 4 bit flips.
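The statistics-matching idea in the abstract can be sketched as a loss that compares the per-channel mean and variance of a synthetic batch against the statistics stored in a batch normalization layer. The squared-difference form, the function name `bn_matching_loss`, and the list-of-lists data layout are all assumptions made for illustration. The abstract does not give the exact objective, and the targeted-label term it mentions is omitted here.

```python
from statistics import fmean, pvariance


def bn_matching_loss(batch, running_mean, running_var):
    """Squared distance between batch statistics and stored BN statistics.

    batch: list of samples, each a list with one activation per channel.
    running_mean, running_var: per-channel statistics stored in a BN layer.
    The squared-difference form is an assumed, illustrative objective.
    """
    loss = 0.0
    for c in range(len(running_mean)):
        values = [sample[c] for sample in batch]
        mu = fmean(values)
        var = pvariance(values, mu)  # population (biased) variance
        loss += (mu - running_mean[c]) ** 2 + (var - running_var[c]) ** 2
    return loss


# Two samples, two channels: channel means are 1.0 and 2.0, variances 1.0 each.
synthetic = [[0.0, 1.0], [2.0, 3.0]]
print(bn_matching_loss(synthetic, [1.0, 2.0], [1.0, 1.0]))  # statistics match
print(bn_matching_loss(synthetic, [0.0, 2.0], [1.0, 1.0]))  # mean of channel 0 is off by 1
```

In a full pipeline, a loss of this kind would presumably be summed over every batch normalization layer and minimized with respect to the synthetic inputs by gradient descent. That step needs an automatic differentiation framework and is left out of this sketch.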
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Integrated Circuits and Semiconductor Failure Analysis · Advancements in Semiconductor Devices and Circuit Design
Methods: FLIP · Batch Normalization
