Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization

Haochun Tang; Yuliang Yan; Jiahua Lu; Huaxiao Liu; Enyan Dai

arXiv:2604.15022 · cs.CR · April 17, 2026

TL;DR

This paper introduces R$^2$A, an adversarial attack method that misleads black-box LLM routers into selecting more expensive models, highlighting security vulnerabilities in cost-aware routing systems.

Contribution

It presents a novel black-box attack technique using surrogate models and suffix optimization to manipulate LLM routing strategies.

Findings

1. R$^2$A significantly increases routing to expensive models across various systems.

2. The attack is effective on both open-source and commercial routing systems.

3. The method works without white-box access or heuristic prompts.

Abstract

Cost-aware routing dynamically dispatches user queries to models of varying capability to balance performance against inference cost. However, this routing strategy introduces a new security concern: adversaries may manipulate the router into consistently selecting expensive, high-capability models. Existing routing attacks depend on either white-box access or heuristic prompts, rendering them ineffective in real-world black-box scenarios. In this work, we propose R$^2$A, which aims to mislead black-box LLM routers into selecting expensive models via adversarial suffix optimization. Specifically, R$^2$A deploys a hybrid ensemble surrogate router to mimic the black-box router, and a suffix optimization algorithm is then adapted for the ensemble-based surrogate. Extensive experiments on multiple open-source and commercial routing systems demonstrate that R$^2$A significantly increases the routing rate to…
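
The idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the two surrogate routers, the vocabulary, the routing threshold, and every function name below are hypothetical, and a plain random hill climb stands in for the paper's suffix optimization algorithm, whose details are not given on this page. The sketch only shows the general shape of the attack: score a query with an ensemble of surrogate routers, then search for a suffix that pushes the ensemble score toward the expensive model.

```python
import random

# Hypothetical toy surrogate routers. Each maps a query to a score in [0, 1];
# a higher score means "looks hard, send to the expensive model".
HARD_WORDS = {"prove", "derive", "theorem", "optimize", "complexity"}

def length_router(query: str) -> float:
    # Longer queries are treated as harder (assumption for illustration).
    return min(len(query.split()) / 40.0, 1.0)

def keyword_router(query: str) -> float:
    # Queries with more "hard" keywords are treated as harder (assumption).
    hits = sum(word in HARD_WORDS for word in query.lower().split())
    return min(hits / 3.0, 1.0)

def ensemble_score(query: str) -> float:
    # Unweighted average over the surrogate ensemble.
    return (length_router(query) + keyword_router(query)) / 2.0

def route(query: str, threshold: float = 0.5) -> str:
    # Toy cost-aware routing decision based on the ensemble score.
    return "expensive" if ensemble_score(query) >= threshold else "cheap"

# Hypothetical candidate tokens for the adversarial suffix.
VOCAB = ["prove", "derive", "theorem", "hello", "please",
         "optimize", "the", "complexity"]

def optimize_suffix(query: str, suffix_len: int = 6, steps: int = 200, seed: int = 0):
    """Random hill climb: mutate one suffix token at a time and keep the
    change only if the ensemble score strictly improves."""
    rng = random.Random(seed)
    suffix = [rng.choice(VOCAB) for _ in range(suffix_len)]
    best = ensemble_score(query + " " + " ".join(suffix))
    for _ in range(steps):
        candidate = list(suffix)
        candidate[rng.randrange(suffix_len)] = rng.choice(VOCAB)
        score = ensemble_score(query + " " + " ".join(candidate))
        if score > best:
            suffix, best = candidate, score
    return " ".join(suffix), best

if __name__ == "__main__":
    q = "what is 2 + 2"
    adv_suffix, adv_score = optimize_suffix(q)
    print("original route:", route(q))
    print("suffix:", adv_suffix)
    print("attacked route:", route(q + " " + adv_suffix))
```

In a real black-box setting the optimized suffix would then be submitted to the target router, and the attack succeeds only to the extent that the surrogate ensemble approximates the target's behavior.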

Code & Models

Repositories

thcxiker/R2A-Attack (GitHub)
