An Empirical Study on Remote Code Execution in Machine Learning Model Hosting Ecosystems

Mohammed Latif Siddiq; Tanzim Hossain Romel; Natalie Sekerak; Beatrice Casey; Joanna C. S. Santos

arXiv:2601.14163 · cs.SE · January 21, 2026

TL;DR

This empirical study investigates the security risks of executing untrusted code during model loading in popular machine learning model-sharing platforms, revealing widespread unsafe practices and developer misconceptions.

Contribution

First large-scale empirical analysis of custom model loading practices, security risks, and developer perceptions across major ML model-sharing ecosystems.

Findings

01. Widespread reliance on unsafe defaults in model loading

02. Uneven security enforcement across platforms

03. Persistent developer misconceptions about security risks

Abstract

Model-sharing platforms, such as Hugging Face, ModelScope, and OpenCSG, have become central to modern machine learning development, enabling developers to share, load, and fine-tune pre-trained models with minimal effort. However, the flexibility of these ecosystems introduces a critical security concern: the execution of untrusted code during model loading (i.e., via trust_remote_code or trust_repo). In this work, we conduct the first large-scale empirical study of custom model loading practices across five major model-sharing platforms to assess their prevalence, associated risks, and developer perceptions. We first quantify the frequency with which models require custom code to function and identify those that execute arbitrary Python files during loading. We then apply three complementary static analysis tools (Bandit, CodeQL, and Semgrep) to detect security smells and potential…
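
To make the risk concrete, the sketch below flags call sites that opt in to remote code execution by passing trust_remote_code=True or trust_repo=True. It is an illustration only and not the study's methodology (the abstract names Bandit, CodeQL, and Semgrep as the tools used). The function name find_risky_loads and the sample snippet are hypothetical, and the check assumes the flag is written as a literal True keyword argument.

```python
import ast

# Keyword arguments that opt in to running repository-supplied code at load time.
RISKY_KEYWORDS = {"trust_remote_code", "trust_repo"}


def find_risky_loads(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, keyword) pairs for calls passing a risky keyword as literal True.

    Hypothetical helper for illustration; it only parses the source and never runs it.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            if (
                kw.arg in RISKY_KEYWORDS
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append((node.lineno, kw.arg))
    return sorted(hits)


# Hypothetical snippet to scan; repository and entry-point names are placeholders.
sample = '''
from transformers import AutoModel
import torch

a = AutoModel.from_pretrained("org/model", trust_remote_code=True)
b = AutoModel.from_pretrained("org/model", trust_remote_code=False)
c = torch.hub.load("owner/repo", "entry", trust_repo=True)
'''

for line, keyword in find_risky_loads(sample):
    print(f"line {line}: {keyword}=True")
```

A literal-only check like this misses flags passed through variables or configuration files, which is one reason a fuller analysis would rely on dedicated static analysis tools.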

Taxonomy

Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Scientific Computing and Data Management