TL;DR
ModelDiff is a testing-based method that detects model reuse by comparing how deep learning models behave on the same test inputs. It works even when the suspect models are not white-box accessible or serve different tasks.
Contribution
Proposes ModelDiff, a behavioral-pattern comparison approach that represents each model as a decision distance vector (DDV) and detects model reuse via the cosine similarity between DDVs.
Findings
Achieved 91.7% correctness on a comprehensive benchmark.
Effective in detecting transfer learning, model compression, and model stealing.
Demonstrated feasibility on real-world mobile deep learning models.
Abstract
The knowledge of a deep learning model may be transferred to a student model, leading to intellectual property infringement or vulnerability propagation. Detecting such knowledge reuse is nontrivial because the suspect models may not be white-box accessible and/or may serve different tasks. In this paper, we propose ModelDiff, a testing-based approach to deep learning model similarity comparison. Instead of directly comparing the weights, activations, or outputs of two models, we compare their behavioral patterns on the same set of test inputs. Specifically, the behavioral pattern of a model is represented as a decision distance vector (DDV), in which each element is the distance between the model's reactions to a pair of inputs. The knowledge similarity between two models is measured with the cosine similarity between their DDVs. To evaluate ModelDiff, we created a benchmark that…
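The DDV comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says each DDV element is "the distance" between a model's reactions to a pair of inputs, so the use of cosine distance for that step is an assumption here, and the `teacher` and `student` functions, the input pairs, and all function names are hypothetical stand-ins for black-box models and test inputs.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def decision_distance_vector(model, input_pairs):
    """One element per input pair: the distance between the model's outputs
    on the two inputs. Cosine distance is an assumption of this sketch."""
    return [1.0 - cosine_similarity(model(a), model(b)) for a, b in input_pairs]

def knowledge_similarity(model_a, model_b, input_pairs):
    """Cosine similarity between the two models' DDVs."""
    ddv_a = decision_distance_vector(model_a, input_pairs)
    ddv_b = decision_distance_vector(model_b, input_pairs)
    return cosine_similarity(ddv_a, ddv_b)

# Hypothetical black-box models: only their outputs are observed.
def teacher(x):
    return [x[0] + x[1], x[0] - x[1], 1.0]   # 3 outputs

def student(x):
    return [x[0] + x[1] + 1.0, x[0] - x[1]]  # 2 outputs, a different "task"

# Hypothetical test input pairs shared by both models.
input_pairs = [
    ((1.0, 0.0), (0.0, 1.0)),
    ((1.0, 2.0), (2.0, 1.0)),
    ((0.5, 0.5), (3.0, -1.0)),
]

ddv_teacher = decision_distance_vector(teacher, input_pairs)
ddv_student = decision_distance_vector(student, input_pairs)
sim = knowledge_similarity(teacher, student, input_pairs)
print(len(ddv_teacher), len(ddv_student), sim)
```

Each DDV has one entry per input pair, so its length does not depend on a model's output dimension. That is why two models with different output spaces can still be compared, and why only model outputs (not weights or activations) are needed.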
