Get a Model! Model Hijacking Attack Against Machine Learning Models
Ahmed Salem, Michael Backes, Yang Zhang

TL;DR
This paper introduces a novel stealthy training time attack called model hijacking, which manipulates computer vision models to perform different tasks without detection, posing security and accountability risks.
Contribution
The authors propose the first model hijacking attack using a new encoder-decoder model, demonstrating high success rates with minimal utility loss.
Findings
High attack success rate achieved
Stealthy attack samples resemble original data
Minimal impact on model utility
Abstract
Machine learning (ML) has established itself as a cornerstone for various critical applications ranging from autonomous driving to authentication systems. However, with this increasing adoption rate of machine learning models, multiple attacks have emerged. One class of such attacks is training time attack, whereby an adversary executes their attack before or during the machine learning model training. In this work, we propose a new training time attack against computer vision based machine learning models, namely model hijacking attack. The adversary aims to hijack a target model to execute a different task than its original one without the model owner noticing. Model hijacking can cause accountability and security risks since a hijacked model owner can be framed for having their model offering illegal or unethical services. Model hijacking attacks are launched in the same way as…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdversarial Robustness in Machine Learning · Advanced Malware Detection Techniques
