Towards Provable (In)Secure Model Weight Release Schemes
Xin Yang, Bintao Tang, Yuhao Wang, Zimo Ji, Terry Jingchen Zhang, Wenyuan Jiang

TL;DR
This paper formalizes security definitions for weight release schemes in machine learning, evaluates a prominent scheme, and uncovers vulnerabilities, emphasizing the need for rigorous security foundations.
Contribution
It introduces concrete security definitions for weight release schemes and demonstrates their application by analyzing and exposing vulnerabilities in TaylorMLP.
Findings
TaylorMLP fails to prevent parameter extraction
Formal security definitions are essential for weight release schemes
The paper advocates for rigorous security evaluation in ML models
Abstract
Recent secure weight release schemes claim to enable open-source model distribution while protecting model ownership and preventing misuse. However, these approaches lack rigorous security foundations and provide only informal security guarantees. Inspired by established works in cryptography, we formalize the security of weight release schemes by introducing several concrete security definitions. We then demonstrate our definitions' utility through a case study of TaylorMLP, a prominent secure weight release scheme. Our analysis reveals vulnerabilities that allow parameter extraction, thus showing that TaylorMLP fails to achieve its informal security goals. We hope this work will advocate for rigorous research at the intersection of machine learning and security communities and provide a blueprint for how future weight release schemes should be designed and evaluated.
Taxonomy
Topics: Software System Performance and Reliability · Smart Grid Security and Resilience · Network Security and Intrusion Detection
