Rerouting LLM Routers

Avital Shafran; Roei Schuster; Thomas Ristenpart; Vitaly Shmatikov

arXiv:2501.01818·cs.CR·January 6, 2025

Open Access

TL;DR

This paper explores the adversarial vulnerabilities of LLM routers, showing how malicious inputs can manipulate routing decisions without degrading response quality, and discusses potential defenses.

Contribution

It introduces the concept of LLM control plane integrity, demonstrates a novel attack that uses confounder gadgets, and evaluates both the attack's effectiveness and potential defenses.

Findings

01. Adversaries can manipulate LLM routing decisions using confounder gadgets.

02. Confounder gadgets do not impact LLM response quality.

03. Perplexity-based filtering is ineffective against these attacks.

Abstract

LLM routers aim to balance quality and cost of generation by classifying queries and routing them to a cheaper or more expensive LLM depending on their complexity. Routers represent one type of what we call LLM control planes: systems that orchestrate use of one or more LLMs. In this paper, we investigate routers' adversarial robustness. We first define LLM control plane integrity, i.e., robustness of LLM orchestration to adversarial inputs, as a distinct problem in AI safety. Next, we demonstrate that an adversary can generate query-independent token sequences we call "confounder gadgets" that, when added to any query, cause LLM routers to send the query to a strong LLM. Our quantitative evaluation shows that this attack is successful both in white-box and black-box settings against a variety of open-source and commercial routers, and that confounding queries do not affect the…
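
As a rough illustration of the routing setup the abstract describes, the sketch below models a router as a scorer plus a threshold and shows how a fixed, query-independent prefix can push any query over that threshold. Everything in it is hypothetical: `ThresholdRouter`, `toy_score`, `add_gadget`, the keyword list, and the hand-picked gadget string are assumptions made for illustration. It does not reflect the routers evaluated in the paper or how the authors generate gadgets, and placing the gadget before the query is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ThresholdRouter:
    """Hypothetical binary router: send a query to the strong model
    when its complexity score reaches the threshold."""
    score: Callable[[str], float]  # assumed scorer returning a value in [0, 1]
    threshold: float = 0.5

    def route(self, query: str) -> str:
        return "strong" if self.score(query) >= self.threshold else "weak"


# Toy vocabulary standing in for whatever signal a learned router picks up on.
HARD_WORDS = {"prove", "theorem", "derive", "complexity", "optimize"}


def toy_score(query: str) -> float:
    """Stand-in scorer: counts 'hard-sounding' words, saturating at three hits."""
    words = [w.strip(".,?!") for w in query.lower().split()]
    hits = sum(w in HARD_WORDS for w in words)
    return min(1.0, hits / 3)


def add_gadget(gadget: str, query: str) -> str:
    """Prepend a fixed token sequence to the query (placement is an assumption)."""
    return f"{gadget} {query}"


router = ThresholdRouter(score=toy_score, threshold=0.5)
gadget = "prove theorem derive"  # hand-picked for the toy scorer, not optimized

plain = "what is 2+2?"
confounded = add_gadget(gadget, plain)

print(router.route(plain))       # routed on the query's own score
print(router.route(confounded))  # routed on the score inflated by the gadget
```

Because the toy scorer saturates after three keyword hits, the same gadget raises the score of every query to the maximum regardless of the query's content, which is the query-independent property the abstract attributes to confounder gadgets. A learned router would not be this easy to read off by hand.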

Taxonomy

Topics: Advanced Data Storage Technologies · Network Packet Processing and Optimization · Interconnection Networks and Systems