Ownership Verification of DNN Models Using White-Box Adversarial Attacks with Specified Probability Manipulation
Teruki Sano, Minoru Kuribayashi, Masao Sakai, Shuji Isobe, Eisuke Koizumi

TL;DR
This paper introduces a framework for verifying ownership of DNN models by using white-box adversarial attacks to manipulate output probabilities, enabling model identification without exposing the original model.
Contribution
It presents a novel adversarial attack-based method for ownership verification of DNNs in gray-box scenarios, which is simple, effective, and does not require the original model during verification.
Findings
The proposed attack aligns the output probability of a specific class to a designated value, and this alignment serves as the evidence of ownership.
Experimental results demonstrate high accuracy in model identification.
The method works in a gray-box scenario: the owner has white-box knowledge of the original model, while the deployed suspect model is reachable only through black-box queries.
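To illustrate how a black-box check of this kind could look, here is a minimal sketch in Python. It is not the authors' protocol: the function name `verify_ownership`, the tolerance `tol`, the threshold `min_match_rate`, and the decision rule itself are all hypothetical choices made for illustration. The only idea taken from the summary is that each probe image is expected to drive one class probability to a designated value on the owner's model.

```python
import numpy as np

def verify_ownership(query, probes, tol=0.05, min_match_rate=0.9):
    """Hypothetical black-box check (assumption, not the paper's rule).

    query  : callable mapping an image to a probability vector
             (stands in for the suspect cloud service).
    probes : list of (image, class_index, designated_probability).
    Returns (decision, match_rate).
    """
    matches = 0
    for image, cls, target in probes:
        probs = np.asarray(query(image), dtype=float)
        # A probe "matches" if the suspect model reproduces the
        # designated probability within the tolerance.
        if abs(probs[cls] - target) <= tol:
            matches += 1
    rate = matches / len(probes)
    return rate >= min_match_rate, rate
```

Because the check needs only the probes and the query interface, a third party could run it without ever seeing the original model, which is the property the summary emphasizes.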
Abstract
In this paper, we propose a novel framework for ownership verification of deep neural network (DNN) models for image classification tasks. It allows both the rightful owner and a third party to verify model identity without presenting the original model. We assume a gray-box scenario in which an unauthorized user holds a model illegally copied from the original, offers it as a service in a cloud environment, and returns classification results as a probability distribution over the output classes in response to submitted images. The framework applies a white-box adversarial attack to align the output probability of a specific class to a designated value. Because the owner knows the original model, the owner can generate such adversarial examples. We propose a simple but effective adversarial attack method based on the iterative Fast Gradient Sign Method (FGSM) by introducing…
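The abstract is truncated before it states what the authors add to iterative FGSM, so the following is only a generic sketch of the underlying idea: signed-gradient steps that push the probability of one class toward a designated value. Everything specific here is an assumption made for illustration: the toy linear softmax model `softmax(W @ x + b)`, the squared-error loss, the function name `align_probability`, and the parameters `eps`, `alpha`, `steps`, and `tol`.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def align_probability(W, b, x, cls, target,
                      eps=0.1, alpha=0.005, steps=200, tol=1e-3):
    """Iterative signed-gradient sketch pushing p[cls] toward `target`.

    Assumptions (not from the paper): a toy white-box model
    p = softmax(W @ x + b) and the loss (p[cls] - target) ** 2.
    """
    x_adv = x.astype(float).copy()
    for _ in range(steps):
        p = softmax(W @ x_adv + b)
        diff = p[cls] - target
        if abs(diff) < tol:
            break
        # d p[cls] / d logits = p[cls] * (onehot(cls) - p)
        dp_dz = -p[cls] * p
        dp_dz[cls] += p[cls]
        grad = 2.0 * diff * (W.T @ dp_dz)          # chain rule to the input
        x_adv = x_adv - alpha * np.sign(grad)      # FGSM-style descent step
        x_adv = np.clip(x_adv, x - eps, x + eps)   # L-infinity budget
        x_adv = np.clip(x_adv, 0.0, 1.0)           # valid pixel range
    return x_adv
```

With a fixed step size the final probability can oscillate around the target by roughly the change one step produces, so a smaller `alpha` trades more iterations for a tighter match. A real image classifier would need gradients from an automatic differentiation framework in place of the closed-form softmax derivative used here.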
Taxonomy
Methods: ALIGN
