SnatchML: Hijacking ML models without Training Access
Mahmoud Ghorbel, Halima Bouzidi, Ioan Marius Bilasco, Ihsen Alouani

TL;DR
SnatchML is a training-free, inference-time model hijacking attack that exploits over-parameterization. It can hijack a model for tasks with more classes than the original task, and the authors propose mitigation strategies.
Contribution
Introduces SnatchML, a novel inference-time hijacking attack that leverages model over-parameterization without training access, and proposes mitigation techniques like meta-unlearning and model compression.
Findings
SnatchML achieves high accuracy when hijacking models deployed on AWS SageMaker.
It can hijack models for tasks with more classes than the original.
Proposed countermeasures can mitigate the hijacking risk.
Abstract
Model hijacking can cause significant accountability and security risks since the owner of a hijacked model can be framed for having their model offer illegal or unethical services. Prior works consider model hijacking as a training time attack, whereby an adversary requires full access to the ML model training. In this paper, we consider a stronger threat model for an inference-time hijacking attack, where the adversary has no access to the training phase of the victim model. Our intuition is that ML models, which are typically over-parameterized, might have the capacity to (unintentionally) learn more than the intended task they are trained for. We propose SnatchML, a new training-free model hijacking attack, that leverages the extra capacity learnt by the victim model to infer different tasks that can be semantically related or unrelated to the original one. Our results on models…
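The intuition in the abstract can be illustrated with a minimal sketch. It assumes the adversary can only query the victim and read its output vector, holds a handful of labelled reference samples for the hijacking task, and assigns each query to the nearest class centroid in the victim's output space. The linear `victim_logits` stand-in, the toy three-class data, and the Euclidean nearest-centroid rule are all hypothetical choices made for illustration; they are not taken from the paper.

```python
import numpy as np

# Hypothetical stand-in for the victim: a frozen linear model with 2 outputs.
# The adversary only calls it; the weights are never updated.
W = np.array([[2.0, -1.0],
              [0.5, 1.5],
              [-2.5, -0.5]])


def victim_logits(x):
    """Black-box inference: map inputs of shape (n, 3) to outputs of shape (n, 2)."""
    return x @ W


def class_centroids(ref_x, ref_y):
    """Average the victim's outputs over the reference samples of each hijacking class."""
    outs = victim_logits(ref_x)
    labels = np.unique(ref_y)
    cents = np.stack([outs[ref_y == c].mean(axis=0) for c in labels])
    return labels, cents


def hijack_predict(query_x, labels, cents):
    """Assign each query the hijacking label of the nearest centroid (Euclidean)."""
    outs = victim_logits(query_x)
    dists = np.linalg.norm(outs[:, None, :] - cents[None, :, :], axis=2)
    return labels[np.argmin(dists, axis=1)]


def make_toy_task(rng, n_per_class):
    """Toy 3-class hijacking task: tight clusters around the unit vectors in 3-D."""
    centers = np.eye(3)
    x = np.concatenate(
        [c + 0.05 * rng.standard_normal((n_per_class, 3)) for c in centers]
    )
    y = np.repeat(np.arange(3), n_per_class)
    return x, y


rng = np.random.default_rng(0)
ref_x, ref_y = make_toy_task(rng, 5)    # few labelled samples held by the adversary
qry_x, qry_y = make_toy_task(rng, 20)   # inputs the adversary wants classified

labels, cents = class_centroids(ref_x, ref_y)
pred = hijack_predict(qry_x, labels, cents)
accuracy = float((pred == qry_y).mean())
print(f"hijacking accuracy on the toy task: {accuracy:.2f}")
```

In this toy setup the victim exposes only two outputs, yet the three hijacking classes stay separable in that output space, which mirrors the claim that a model can be hijacked for more classes than its original task. The clusters are separable by construction, so the sketch says nothing about accuracy on real models.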
Taxonomy
Topics: Simulation Techniques and Applications
