Exploiting Leaderboards for Large-Scale Distribution of Malicious Models

Anshuman Suri; Harsh Chaudhari; Yuefeng Peng; Ali Naseh; Amir Houmansadr; Alina Oprea

arXiv:2507.08983 · cs.LG · July 15, 2025

Open Access

TL;DR

This paper shows how adversaries can exploit model leaderboards to distribute poisoned models at scale. It presents a framework that embeds malicious behaviors while keeping leaderboard performance competitive across four modalities: text-embedding, text-generation, text-to-speech, and text-to-image.

Contribution

The paper introduces TrojanClimb, a general framework for injecting malicious behaviors into models while maintaining competitive leaderboard performance, exposing a significant security vulnerability in model leaderboards.

Findings

01

Adversaries can achieve high leaderboard rankings with poisoned models.

02

TrojanClimb is effective across four modalities: text-embedding, text-generation, text-to-speech, and text-to-image.

03

Leaderboard systems are vulnerable to large-scale malicious model distribution.

Abstract

While poisoning attacks on machine learning models have been extensively studied, the mechanisms by which adversaries can distribute poisoned models at scale remain largely unexplored. In this paper, we shed light on how model leaderboards -- ranked platforms for model discovery and evaluation -- can serve as a powerful channel for adversaries for stealthy large-scale distribution of poisoned models. We present TrojanClimb, a general framework that enables injection of malicious behaviors while maintaining competitive leaderboard performance. We demonstrate its effectiveness across four diverse modalities: text-embedding, text-generation, text-to-speech and text-to-image, showing that adversaries can successfully achieve high leaderboard rankings while embedding arbitrary harmful functionalities, from backdoors to bias injection. Our findings reveal a significant vulnerability in the…

Taxonomy

Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications