PickleFuzzer: A Case Study in Fuzzing for Discrepancies Between Python Pickle Implementations
Justin Applegate, Andreas Kellas

TL;DR
PickleFuzzer is a custom fuzzing tool that detects discrepancies between Python's pickle implementations, uncovering security vulnerabilities and inconsistencies that can undermine the safe handling of untrusted data.
Contribution
We developed PickleFuzzer, a generation-based fuzzer that identifies implementation discrepancies without relying on a formal specification, revealing critical security issues.
Findings
Detected 14 discrepancies between pickle implementations.
Identified 4 critical discrepancies that can bypass security tools.
Disclosed the security issues to the Python Software Foundation and a bug bounty platform.
Abstract
Python's native serialization protocol, pickle, is a powerful but insecure format for transferring untrusted data. It is frequently used, especially for saving machine learning models, despite known security challenges. While developers sometimes mitigate this risk by restricting imports during unpickling or using static and dynamic analysis tools, these approaches are error-prone and depend heavily on accurate interpretations of the Pickle Virtual Machine (PVM) opcodes. Discrepancies across Python's three native PVM modules can lead to incorrect detection of malicious payloads and undermine existing defenses. To efficiently and scalably identify discrepancies, we present PickleFuzzer, a custom generation-based fuzzer that identifies inconsistencies across pickle implementations. PickleFuzzer generates pickle objects, passes them to each implementation, and detects differences in…
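The differential idea described in the abstract can be illustrated with a minimal sketch. This is not PickleFuzzer itself: it hands one byte string to three standard-library code paths (the C-accelerated `pickle.Unpickler`, the pure-Python `pickle._Unpickler`, and the `pickletools.genops` disassembler) and flags the input when they disagree. The helper names (`SafeC`, `SafePy`, `outcome`, `compare`) are hypothetical, the comparison is deliberately coarse (success versus failure, plus the `repr` of loaded values), and all globals are blocked through `find_class` so that the sketch never imports anything while loading. `pickle._Unpickler` is a private name, and on builds without the `_pickle` extension both loader classes resolve to the same pure-Python implementation.

```python
import io
import pickle
import pickletools


class SafeC(pickle.Unpickler):
    """C-accelerated unpickler (when _pickle is available) with all globals blocked."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is blocked")


class SafePy(pickle._Unpickler):
    """Pure-Python unpickler (private name, assumed present) with all globals blocked."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is blocked")


def disassemble(data: bytes) -> None:
    """Walk every opcode with pickletools; raises ValueError on malformed input."""
    for _ in pickletools.genops(data):
        pass


def outcome(loader, data: bytes):
    """Run one implementation and reduce its behaviour to (status, detail)."""
    try:
        return ("ok", repr(loader(data)))
    except Exception as exc:  # any failure counts as a rejection of the input
        return ("error", type(exc).__name__)


def compare(data: bytes):
    """Return per-implementation outcomes and whether they disagree."""
    results = {
        "c": outcome(lambda d: SafeC(io.BytesIO(d)).load(), data),
        "python": outcome(lambda d: SafePy(io.BytesIO(d)).load(), data),
        "pickletools": outcome(disassemble, data),
    }
    statuses = {status for status, _ in results.values()}
    c, py = results["c"], results["python"]
    # Exception types often differ harmlessly, so values are compared
    # only when both loaders succeeded.
    value_mismatch = c[0] == "ok" and py[0] == "ok" and c[1] != py[1]
    return results, (len(statuses) > 1 or value_mismatch)


# A well-formed pickle with no globals, and the same bytes with STOP removed.
sample = pickle.dumps([1, "a", {"k": 2.5}], protocol=2)
truncated = sample[:-1]

if __name__ == "__main__":
    for label, data in (("sample", sample), ("truncated", truncated)):
        results, differs = compare(data)
        print(label, results, "discrepancy" if differs else "agree")
```

A real fuzzer would replace the two fixed inputs with generated opcode sequences and would need a finer comparison than this one, for example tracking which opcode each implementation stopped at.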
