Bounding-box Watermarking: Defense against Model Extraction Attacks on Object Detectors
Satoru Koda, Ikuya Morikawa

TL;DR
This paper introduces a bounding-box watermarking technique that defends object detection models against model extraction attacks by stealthily embedding backdoors into extracted models through API responses. It reports 100% accuracy in identifying extracted models.
Contribution
It presents a new bounding-box-based backdoor watermarking method, designed specifically for object detection models, that defends against model extraction attacks by making extracted models identifiable.
Findings
Achieves 100% accuracy in identifying extracted models.
Effective across multiple datasets and scenarios.
Maintains object detection performance while embedding watermarks.
Abstract
Deep neural networks (DNNs) deployed in the cloud often let users query models through APIs. However, these APIs expose the models to model extraction attacks (MEAs), in which an attacker attempts to duplicate the target model by abusing the API's responses. Backdoor-based DNN watermarking is a promising defense against MEAs: the defender injects a backdoor into extracted models via the API responses. The backdoor serves as a watermark of the model; if a suspicious model carries the watermark (i.e., the backdoor), it is verified as an extracted model. This work focuses on object detection (OD) models. Existing backdoor attacks on OD models are not applicable to model watermarking as a defense against MEAs under a realistic threat model. Our proposed approach involves inserting a backdoor into extracted models via APIs by stealthily modifying the bounding-boxes…
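The abstract is cut off before it describes how the bounding boxes are modified, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical trigger (any detected object of an illustrative class, `TRIGGER_LABEL`), a hypothetical distortion (enlarging the box around its center by `SCALE`), and a hypothetical verification rule (checking whether a suspect model reproduces the enlarged box). All names and constants are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str


# Hypothetical choices for this sketch; the truncated abstract specifies none of them.
TRIGGER_LABEL = "stop sign"
SCALE = 1.5


def area(box: Box) -> float:
    return (box.x_max - box.x_min) * (box.y_max - box.y_min)


def enlarge(box: Box, scale: float) -> Box:
    """Scale a box about its center, keeping its label."""
    cx = (box.x_min + box.x_max) / 2
    cy = (box.y_min + box.y_max) / 2
    half_w = (box.x_max - box.x_min) / 2 * scale
    half_h = (box.y_max - box.y_min) / 2 * scale
    return Box(cx - half_w, cy - half_h, cx + half_w, cy + half_h, box.label)


def watermarked_response(true_boxes: List[Box]) -> List[Box]:
    """API-side step: distort only the boxes of the assumed trigger class."""
    return [
        enlarge(b, SCALE) if b.label == TRIGGER_LABEL else b
        for b in true_boxes
    ]


def looks_watermarked(clean: Box, predicted: Box, tol: float = 0.2) -> bool:
    """Verification step: does the suspect model's box show the distortion?

    Compares the predicted/clean area ratio with SCALE**2, the ratio that
    the assumed distortion would produce.
    """
    expected = SCALE ** 2
    ratio = area(predicted) / area(clean)
    return abs(ratio - expected) <= tol * expected


clean_boxes = [Box(0, 0, 10, 10, "person"), Box(10, 10, 20, 20, "stop sign")]
response = watermarked_response(clean_boxes)
```

In this sketch a model trained on such responses would tend to learn the enlarged boxes for the trigger class, while an independently trained model would not, which is what `looks_watermarked` tests for on a single box. A real verification procedure would aggregate over many test images rather than rely on one prediction.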
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Biometric Identification and Security · Chaos-based Image/Signal Encryption
