Attack as Defense: Run-time Backdoor Implantation for Image Content Protection
Haichuan Zhang, Meiyu Lin, Zhaoyi Liu, Renyuan Li, Zhiyuan Cheng, Carl Yang, Mingjie Tang

TL;DR
This paper introduces a novel run-time backdoor implantation method to protect sensitive image content from unauthorized modifications by triggering failure in image editing models, without requiring model retraining.
Contribution
It presents the first efficient framework for run-time backdoor implantation that safeguards image content by using imperceptible perturbations as triggers, minimizing impact on legitimate edits.
Findings
Significantly increases CLIP-FID scores under malicious editing.
Reduces SSIM in malicious edits, indicating effective protection.
Maintains minimal impact on benign image editing.
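For readers unfamiliar with the SSIM metric cited in the findings, the following is a minimal sketch of how a structural-similarity score between an original image and an edited one could be computed. It is not the paper's evaluation code. It assumes grayscale images flattened into lists of pixel values in the range 0 to 255, and it uses a single global window rather than the local sliding windows of standard SSIM. The function name `global_ssim` and the sample pixel values are hypothetical.

```python
def global_ssim(x, y, data_range=255.0):
    """Simplified single-window SSIM between two equally sized grayscale images.

    x and y are flat sequences of pixel intensities. Standard SSIM averages
    this quantity over local windows; this sketch uses one global window.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilizing constants with the conventional K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical pixel data: an original image and a heavily distorted edit.
original = [10, 50, 90, 130, 170, 210]
distorted = list(reversed(original))

same_score = global_ssim(original, original)      # identical images score about 1
distorted_score = global_ssim(original, distorted)  # structure destroyed, much lower
```

A lower score means the edited output shares less structure with the reference image, which is why a reduced SSIM under malicious editing is read here as a sign that the edit failed.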
Abstract
As generative models achieve great success, tampering with and modifying sensitive image content (e.g., human faces, artist signatures, commercial logos) has become a significant threat with social impact. A backdoor attack implants vulnerabilities in a target model that can later be activated through a trigger. In this work, we prevent the abuse of image content modification by implanting a backdoor into image-editing models. Once the protected sensitive content of an image is modified by an editing model, the backdoor is triggered, making the edit fail. Unlike traditional backdoor attacks that rely on data poisoning, we develop the first framework for run-time backdoor implantation, which enables protection of individual images, eliminates the need for model training, and is both time- and resource-efficient. We generate imperceptible…
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Digital Media Forensic Detection · Physical Unclonable Functions (PUFs) and Hardware Security
