An Engorgio Prompt Makes Large Language Model Babble on
Jianshuo Dong, Ziyuan Zhang, Qingjie Zhang, Tianwei Zhang, Hao Wang, Hewu Li, Qi Li, Chao Zhang, Ke Xu, and Han Qiu

TL;DR
This paper introduces Engorgio, a method to craft prompts that intentionally increase the computation cost and latency of large language models, exposing vulnerabilities in their inference process.
Contribution
The paper presents a novel technique for generating adversarial prompts that exploit the auto-regressive nature of LLMs to cause abnormally long outputs, threatening service availability.
Findings
Engorgio prompts can induce 2-13× longer outputs in 13 open-source LLMs.
The method successfully affects LLMs in white-box scenarios.
Real-world experiments demonstrate Engorgio's threat to LLM services.
Abstract
Auto-regressive large language models (LLMs) have yielded impressive performance in many real-world tasks. However, the new paradigm of these LLMs also exposes novel threats. In this paper, we explore their vulnerability to inference cost attacks, where a malicious user crafts Engorgio prompts to intentionally increase the computation cost and latency of the inference process. We design Engorgio, a novel methodology, to efficiently generate adversarial Engorgio prompts to affect the target LLM's service availability. Engorgio has the following two technical contributions. (1) We employ a parameterized distribution to track LLMs' prediction trajectory. (2) Targeting the auto-regressive nature of LLMs' inference process, we propose novel loss functions to stably suppress the appearance of the <EOS> token, whose occurrence will interrupt the LLM's generation process. We conduct extensive…
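The second contribution, suppressing the <EOS> token, can be illustrated with a toy objective. This is only a minimal sketch under stated assumptions: the paper's actual loss functions are not reproduced in this summary, and the names `softmax`, `eos_suppression_loss`, and the 4-token vocabulary with <EOS> at index 3 are hypothetical. It shows one plausible form of such an objective, the mean probability the model assigns to <EOS> across generation steps, which an attacker would try to minimize.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one vector of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def eos_suppression_loss(step_logits, eos_id):
    """Mean probability assigned to <EOS> across all generation steps.

    Hypothetical objective, not the paper's exact loss: driving this value
    toward zero makes it less likely that generation stops at any position.
    """
    probs = [softmax(logits)[eos_id] for logits in step_logits]
    return sum(probs) / len(probs)

# Toy example: vocabulary of 4 tokens, <EOS> at index 3 (assumed for illustration).
step_logits = [
    [0.0, 0.0, 0.0, 0.0],   # uniform distribution, so P(<EOS>) = 0.25
    [2.0, 1.0, 0.0, -5.0],  # <EOS> strongly disfavoured at this step
]
loss = eos_suppression_loss(step_logits, eos_id=3)
print(round(loss, 4))
```

In a real attack the logits would come from the target model and the loss would be differentiated with respect to the parameterized prompt distribution; both are outside the scope of this sketch.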
Peer Reviews
Decision·ICLR 2025 Poster
1. It is novel to design inference cost attacks against decoder-only LLMs by modeling the LLM’s inference trajectory to suppress the appearance of the <EOS> token. 2. The paper uses extensive experiments to demonstrate the effectiveness and transferability of the method in increasing output length for various LLMs. 3. The paper is well-written and easy to follow.
1. The paper does not specify how many prompts are sampled from the distribution in the experiments. The paper has limited discussions about the test stage of the generated prompts. Are the reported average lengths and rates robust to the sampling process? How many samples are generated from the proxy distribution in the experiment? 2. How does the optimization process initialize? Does it initialize from zero or random prompt? Could the authors also please provide some examples of the generated
The paper does a good job of explaining related work. The authors explain specifically how their work fits into existing work and make it clear that there is a need for their contribution. The experiments are well set up. They include a good variety of models, different types of inputs/prompts, and the authors provide many setup/configuration/metric details that make their experiments highly reproducible. The real-world experiment is great! It is very helpful in demonstrating how effective the
Even though there is a great real-world experiment, it is just one experiment and it is hard to know in general how practical this attack is in the real world. It would be helpful to have an idea more generally about how long responses can be guaranteed to increase inference costs. It seems like this effect could be insignificant/trivial. The results mostly focus on Avg-len and Avg-rate, but how does this generally translate to increases in inference cost? And how does the increase in cost compa
1. This is the first paper studying inference cost attacks against modern LLMs. To achieve effective inference cost attacks, the authors analyze the challenges and propose the Engorgio method, which can effectively and stably induce lengthy LLM responses. 2. Comprehensive experiments are conducted to demonstrate the effectiveness of Engorgio. The authors even simulate a real-world attack case for LLM services on a Hugging Face inference endpoint.
1. For most LLM servers, the deployed models are unknown to users. It is not practical to consider totally white-box settings. 2. Lack of experiments with baseline defense. Though Section 5 mentions the potential defense approaches, there are no experiments to demonstrate whether a simple filter like input prompt perplexity could largely reduce the proposed attack.
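The input-perplexity filter mentioned in the last review point can be sketched as follows. This is an illustration of the general idea only, not a defense evaluated in the paper: the function names, the threshold of 100, and the hand-written log-probabilities are all assumptions, and in practice the per-token log-probabilities would come from a reference language model scoring the incoming prompt.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-mean natural-log probability per token)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def is_suspicious(token_logprobs, threshold):
    """Flag a prompt whose perplexity exceeds the (hypothetical) threshold."""
    return perplexity(token_logprobs) > threshold

# Hand-written stand-ins for reference-model scores (assumed values).
natural_prompt = [math.log(0.25)] * 8    # each token fairly predictable
garbled_prompt = [math.log(0.001)] * 8   # each token highly surprising

for name, logprobs in [("natural", natural_prompt), ("garbled", garbled_prompt)]:
    print(name, is_suspicious(logprobs, threshold=100.0))
```

Whether such a filter would actually catch Engorgio prompts is exactly the open question the reviewer raises; the sketch only shows what the check computes.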
Taxonomy
Topics: Natural Language Processing Techniques
