Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips
Jiawang Bai, Kuofeng Gao, Dihong Gong, Shu-Tao Xia, Zhifeng Li, and Wei Liu

TL;DR
This paper introduces a Trojan attack on neural networks that flips weight bits and crafts nearly imperceptible Trojan images using additive noise and a per-pixel flow field, so the attack stays effective without a conspicuous patch trigger.
Contribution
The paper proposes the hardly perceptible Trojan attack (HPT) that generates subtle trigger images by optimizing noise and pixel flow, improving stealthiness over existing patch-based methods.
Findings
HPT produces nearly imperceptible Trojan images.
HPT achieves attack success rates comparable to or better than those of existing patch-based methods.
An effective optimization algorithm handles the binary constraints on the weight bits.
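To make the notion of a bit flip concrete, here is a minimal sketch of flipping one bit of a stored weight. It assumes weights are held as 8-bit two's-complement integers, which this summary does not state; `flip_bit` and `hamming_distance` are hypothetical helper names for illustration, not code from the paper.

```python
def flip_bit(weight: int, bit: int, n_bits: int = 8) -> int:
    """Flip one bit of a signed two's-complement integer weight.

    Assumption: weights are quantized to n_bits-wide signed integers.
    """
    if not 0 <= bit < n_bits:
        raise ValueError("bit index out of range")
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    if not lo <= weight <= hi:
        raise ValueError("weight does not fit in n_bits")
    mask = (1 << n_bits) - 1
    # Work on the unsigned bit pattern, then toggle the chosen bit.
    raw = (weight & mask) ^ (1 << bit)
    # Reinterpret the pattern as a signed value.
    return raw - (1 << n_bits) if raw >= (1 << (n_bits - 1)) else raw


def hamming_distance(a: int, b: int, n_bits: int = 8) -> int:
    """Count the bits in which two n_bits-wide weights differ."""
    mask = (1 << n_bits) - 1
    return bin((a ^ b) & mask).count("1")
```

Flipping the most significant bit changes a weight far more than flipping the least significant one, which is why the choice of bits matters: each bit is a 0/1 variable, so the search over flips is a binary-constrained problem.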
Abstract
The security of deep neural networks (DNNs) has attracted increasing attention due to their widespread use in various applications. Recently, deployed DNNs have been demonstrated to be vulnerable to Trojan attacks, which manipulate model parameters with bit flips to inject a hidden behavior and activate it by a specific trigger pattern. However, all existing Trojan attacks adopt noticeable patch-based triggers (e.g., a square pattern), making them perceptible to humans and easy for machines to spot. In this paper, we present a novel attack, namely the hardly perceptible Trojan attack (HPT). HPT crafts hardly perceptible Trojan images by utilizing additive noise and a per-pixel flow field to tweak the pixel values and positions of the original images, respectively. To achieve superior attack performance, we propose to jointly optimize bit flips, additive noise, and flow field.…
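The two image modifications named in the abstract can be sketched as follows: the flow field moves where each output pixel samples from (position), and the additive noise shifts the sampled value. This is an illustrative sketch only. The grayscale image in [0, 1], the bilinear sampling, the border clamping, the noise bound `eps`, and the function names are all assumptions, not details taken from the paper.

```python
def bilinear_sample(img, y, x):
    """Sample a 2D grayscale image at a fractional (y, x) location."""
    h, w = len(img), len(img[0])
    # Clamp the sampling location to the image borders.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = img[y0][x0] * (1 - dx) + img[y0][x1] * dx
    bottom = img[y1][x0] * (1 - dx) + img[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def craft_trojan_image(img, flow, noise, eps=4 / 255):
    """Warp img by a per-pixel flow field, then add bounded noise.

    img:   h x w list of floats in [0, 1]
    flow:  h x w list of (dy, dx) offsets, one per output pixel
    noise: h x w list of floats, clipped to [-eps, eps] (assumed bound)
    """
    h, w = len(img), len(img[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            dy, dx = flow[i][j]
            # Position change: read the pixel from a shifted location.
            warped = bilinear_sample(img, i + dy, j + dx)
            # Value change: add noise kept within the assumed budget.
            delta = min(max(noise[i][j], -eps), eps)
            row.append(min(max(warped + delta, 0.0), 1.0))
        out.append(row)
    return out
```

With a zero flow field and zero noise the function returns the original image, so small offsets and a small `eps` keep the result close to it. In the attack described above, the flow and noise would be optimized jointly with the bit flips rather than set by hand.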
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
