When Efficiency Backfires: Cascading LLMs Trigger Cascade Failure under Adversarial Attack
Zehan Sun, Dingfan Chen, Songze Li

TL;DR
This paper reveals that LLM cascade systems, designed for efficiency, are vulnerable to adversarial attacks that can degrade both their performance and cost-effectiveness, highlighting systemic security risks.
Contribution
It introduces the first targeted adversarial attack framework specifically designed for LLM cascade systems, exploiting their structure to cause significant performance and efficiency degradation.
Findings
Adversarial manipulation can disrupt cascade system performance and cost benefits.
The proposed attack framework adapts to different adversary capabilities.
Experiments demonstrate the attack's effectiveness across various datasets.
Abstract
Large Language Model (LLM) cascade systems are designed to balance efficiency and performance by processing queries with lightweight models while selectively escalating complex cases to more powerful ones. Such systems seek to reduce computational cost and latency while maintaining task performance, making them an appealing choice for large-scale deployment. However, the cascade design expands the attack surface: the lightweight front-end models and the internal decision mechanisms each introduce new weaknesses. In this work, we present the first study demonstrating that LLM cascade systems are susceptible to targeted adversarial manipulation, which disrupts both performance objectives and the intended cost advantages of the cascade design. We propose a novel attack framework that employs constrained sequential collaborative optimization of…
