SafePickle: Robust and Generic ML Detection of Malicious Pickle-based ML Models
Hillel Ohayon, Daniel Gilkarov, Ran Dubin

TL;DR
SafePickle introduces a machine-learning approach to detect malicious pickle-based ML models, achieving high accuracy and robustness against evasive attacks without complex system setups or policy generation.
Contribution
It presents a lightweight, library-agnostic detection method that outperforms state-of-the-art scanners and effectively identifies evasive malicious pickle files.
Findings
Achieves a 90.01% F1-score on the authors' own dataset.
Outperforms existing scanners on multiple datasets.
Successfully detects all evasive malicious models.
Abstract
Model repositories such as Hugging Face increasingly distribute machine learning artifacts serialized with Python's pickle format, exposing users to remote code execution (RCE) risks during model loading. Recent defenses, such as PickleBall, rely on per-library policy synthesis that requires complex system setups and verified benign models, which limits scalability and generalization. In this work, we propose a lightweight, machine-learning-based scanner that detects malicious Pickle-based files without policy generation or code instrumentation. Our approach statically extracts structural and semantic features from Pickle bytecode and applies supervised and unsupervised models to classify files as benign or malicious. We construct and release a labeled dataset of 727 Pickle-based files from Hugging Face and evaluate our models on four datasets: our own, PickleBall (out-of-distribution),…
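The static feature extraction described in the abstract can be illustrated with a minimal sketch using Python's standard `pickletools` module, which walks the opcode stream without executing it. The feature names, the opcode groupings (`IMPORT_OPS`, `CALL_OPS`), and the `extract_features` helper are assumptions made for illustration; they are not the feature set used by SafePickle.

```python
import collections
import pickle
import pickletools

# Hypothetical opcode groups (an assumption, not taken from the paper):
# opcodes that import a callable, and opcodes that invoke or build objects.
IMPORT_OPS = {"GLOBAL", "STACK_GLOBAL", "INST", "OBJ"}
CALL_OPS = {"REDUCE", "NEWOBJ", "NEWOBJ_EX", "BUILD"}


def extract_features(data: bytes) -> dict:
    """Statically scan pickle bytecode and return simple count features.

    pickletools.genops only parses the opcode stream; nothing is unpickled.
    """
    counts = collections.Counter()
    for opcode, _arg, _pos in pickletools.genops(data):
        counts[opcode.name] += 1
    return {
        "n_opcodes": sum(counts.values()),
        "n_import_ops": sum(counts[name] for name in IMPORT_OPS),
        "n_call_ops": sum(counts[name] for name in CALL_OPS),
    }


class CallsOnLoad:
    """Stand-in for a payload: unpickling would call print("hi")."""

    def __reduce__(self):
        return (print, ("hi",))


# Serializing is safe; the callable would only run on pickle.loads,
# which this sketch never calls.
benign_bytes = pickle.dumps({"weights": [1.0, 2.0]}, protocol=2)
suspicious_bytes = pickle.dumps(CallsOnLoad(), protocol=2)

benign_features = extract_features(benign_bytes)
suspicious_features = extract_features(suspicious_bytes)
```

Feature dictionaries like these could then be turned into vectors and passed to a supervised or unsupervised model of the reader's choice. Plain containers of numbers and strings produce no import or call opcodes, while an object with a custom `__reduce__` produces both, which is the kind of structural signal a classifier can learn from.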
Taxonomy
Topics: Advanced Malware Detection Techniques · Security and Verification in Computing · Software Testing and Debugging Techniques
