Model X-Ray: Detection of Hidden Malware in AI Model Weights using Few Shot Learning
Daniel Gilkarov, Ran Dubin

TL;DR
This paper introduces a novel few-shot learning approach to detect hidden malware in AI model weights by transforming the problem into an image classification task, significantly reducing training data requirements and improving detection of subtle steganographic attacks.
Contribution
The study presents a new application of few-shot learning to AI model security, enabling effective malware detection with minimal training data and robustness against diverse steganography techniques.
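To make the few-shot idea concrete, here is a minimal sketch of one common few-shot scheme, prototype-based classification: each class is summarized by the mean of a handful of labeled feature vectors, and a query is assigned to the nearest mean. This is an illustration only. The summary does not say which few-shot method the paper uses, and the names `build_prototypes` and `classify`, the two-dimensional features, and the three examples per class are all assumptions.

```python
import math


def build_prototypes(support):
    """Average the few labeled feature vectors of each class into one prototype.

    `support` maps a class label to a short list of equal-length feature vectors.
    (Hypothetical helper; the features would come from some image encoder.)
    """
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is closest to `query` (Euclidean distance)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Toy support set: three made-up feature vectors per class.
support = {
    "benign": [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0]],
    "malicious": [[1.0, 0.9], [0.9, 1.0], [1.0, 1.0]],
}
prototypes = build_prototypes(support)
print(classify([0.8, 0.9], prototypes))
```

The appeal of this kind of scheme is that adding a class or refreshing the support set only requires recomputing a mean, not retraining on thousands of models.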
Findings
Reduces the training dataset from 40,000 models to 6
Detects attacks with embedding rates as low as 6%
Successfully identifies novel spread-spectrum steganography
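The embedding-rate finding above can be illustrated with a small sketch of least-significant-bit (LSB) steganography on float32 weights. It assumes that "embedding rate" means the fraction of each weight's 32 bits that is overwritten with payload, so 2 bits per weight is 2/32, roughly 6%. That definition, and the helper names `embed_lsb` and `extract_lsb`, are assumptions made for illustration and are not taken from the paper.

```python
import struct


def _float_to_bits(w):
    """Return the 32-bit pattern of `w` after rounding it to float32."""
    (raw,) = struct.unpack("<I", struct.pack("<f", w))
    return raw


def embed_lsb(weights, payload, n_bits):
    """Overwrite the n_bits lowest mantissa bits of each weight with payload bits."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > n_bits * len(weights):
        raise ValueError("payload does not fit in the given weights")
    mask = (1 << n_bits) - 1
    out, pos = [], 0
    for w in weights:
        raw = _float_to_bits(w)
        if pos < len(bits):
            chunk = bits[pos:pos + n_bits]
            chunk += [0] * (n_bits - len(chunk))  # zero-pad the final chunk
            value = int("".join(map(str, chunk)), 2)
            raw = (raw & ~mask & 0xFFFFFFFF) | value
            pos += n_bits
        (new_w,) = struct.unpack("<f", struct.pack("<I", raw))
        out.append(new_w)
    return out


def extract_lsb(weights, n_bits, n_bytes):
    """Read back n_bytes of payload from the n_bits lowest bits of each weight."""
    mask = (1 << n_bits) - 1
    bits = []
    for w in weights:
        value = _float_to_bits(w) & mask
        bits.extend((value >> (n_bits - 1 - i)) & 1 for i in range(n_bits))
    bits = bits[: n_bytes * 8]
    return bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8)
    )


weights = [i / 100 - 0.5 for i in range(100)]   # toy "model"
stego = embed_lsb(weights, b"payload", n_bits=2)
print(extract_lsb(stego, n_bits=2, n_bytes=7))  # recovers b"payload"
print(2 / 32)                                   # 0.0625 under the assumed definition
```

Because only the lowest mantissa bits change, each weight moves by a tiny amount, which is why low embedding rates are hard to spot from model behavior alone.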
Abstract
The potential for exploitation of AI models has increased due to the rapid advancement of Artificial Intelligence (AI) and the widespread use of platforms like Model Zoo for sharing AI models. Attackers can embed malware within AI models through steganographic techniques, taking advantage of the substantial size of these models to conceal malicious data and use it for nefarious purposes, e.g., Remote Code Execution. Ensuring the security of AI models is a burgeoning area of research, essential for safeguarding the many organizations and users that rely on AI technologies. This study leverages well-studied image few-shot learning techniques by transferring AI models to the image domain using a novel image representation. Applying few-shot learning in this field enables us to create practical models, something previous works have not achieved. Our method addresses critical limitations in…
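As a rough illustration of what "transferring AI models to the image domain" can look like, the sketch below serializes float32 weights to raw bytes and lays those bytes out as rows of 8-bit grayscale pixels. This is a generic byte-to-pixel mapping chosen for simplicity, not the paper's novel representation, which the truncated abstract does not describe. The function name `weights_to_image` and the `width` parameter are hypothetical.

```python
import struct


def weights_to_image(weights, width):
    """Lay out the float32 bytes of `weights` as rows of grayscale pixels (0-255).

    Each weight contributes 4 little-endian bytes; the last row is zero-padded.
    """
    raw = struct.pack(f"<{len(weights)}f", *weights)
    raw += bytes((-len(raw)) % width)  # pad so every row has `width` pixels
    return [list(raw[i:i + width]) for i in range(0, len(raw), width)]


image = weights_to_image([0.5, -1.25, 3.0, 0.0], width=4)
for row in image:
    print(row)
```

With `width=4`, each row holds exactly one weight, so the low-order mantissa bytes (where LSB-style payloads would sit) line up in the leftmost columns. An image classifier could in principle learn to tell structured payload bytes from the noise-like bytes of ordinary weights.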
Taxonomy
Topics: Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications
