LLMEffiChecker: Understanding and Testing Efficiency Degradation of Large Language Models
Xiaoning Feng, Xiaohong Han, Simin Chen, Wei Yang

TL;DR
This paper introduces LLMEffiChecker, a method for testing and understanding the efficiency robustness of large language models. It generates adversarial inputs that significantly delay response generation, revealing potential vulnerabilities.
Contribution
The paper presents a novel approach for testing LLM efficiency robustness using gradient-guided and causal inference techniques in white-box and black-box settings, respectively.
Findings
Average latency increase of 325% to 3244% using minimal perturbations
Average energy consumption increase of 344% to 3616%
Effective delay of EOS generation in nine public LLMs.
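To make the percentages above concrete, here is a minimal sketch that assumes they are relative increases over the unperturbed input, which this summary does not state explicitly. The function name and the sample timings are hypothetical and serve only as illustration.

```python
def percent_increase(baseline: float, perturbed: float) -> float:
    """Relative increase of `perturbed` over `baseline`, in percent.

    Assumption: the reported figures follow (perturbed - baseline) / baseline * 100.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (perturbed - baseline) / baseline * 100.0


# Hypothetical timings: 1.0 s for the original input, 5.0 s for the perturbed one.
increase = percent_increase(1.0, 5.0)
print(f"{increase:.0f}% increase")  # a 400% increase means 5x the baseline cost
```

Under this reading, an increase of N% corresponds to a cost of (1 + N/100) times the baseline.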
Abstract
In this paper, we make the first attempt to understand and test potential computation efficiency robustness in state-of-the-art LLMs. By analyzing the working mechanism and implementation of 20,543 publicly accessible LLMs, we observe a fundamental property in LLMs that could be manipulated in an adversarial manner to reduce computation efficiency significantly. Our key motivation is to generate test inputs that could sufficiently delay the generation of EOS such that LLMs would have to go through enough iterations to satisfy the pre-configured threshold. We present LLMEffiChecker, which can work in both white-box and black-box settings. In the white-box scenario, LLMEffiChecker develops a gradient-guided technique that searches for a minimal and unnoticeable perturbation at the character, token, and structure levels. In the black-box scenario, LLMEffiChecker employs a causal inference-based…
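The property the abstract describes can be illustrated with a toy decoding loop. This is a minimal sketch, not the paper's implementation: the `EOS` token id, `greedy_decode`, and `make_toy_model` are hypothetical names, and the "model" is a stand-in function rather than a real LLM. It only shows why delaying EOS forces the loop to run until the pre-configured length threshold.

```python
from typing import Callable, List

EOS = 0  # hypothetical end-of-sequence token id


def greedy_decode(next_token: Callable[[List[int]], int], max_length: int) -> List[int]:
    """Generate tokens until EOS appears or max_length iterations have run."""
    output: List[int] = []
    for _ in range(max_length):
        token = next_token(output)
        output.append(token)
        if token == EOS:
            break  # early stop: the cost is proportional to the steps taken so far
    return output


def make_toy_model(eos_step: int) -> Callable[[List[int]], int]:
    """Toy stand-in for an LLM: emits EOS once `eos_step` tokens have been produced."""
    def next_token(prefix: List[int]) -> int:
        return EOS if len(prefix) >= eos_step else 1
    return next_token


# A benign input reaches EOS quickly; an input that delays EOS runs to the threshold.
benign = greedy_decode(make_toy_model(eos_step=5), max_length=50)
delayed = greedy_decode(make_toy_model(eos_step=10_000), max_length=50)
print(len(benign), len(delayed))
```

Each loop iteration stands for one forward pass, so the number of iterations is a rough proxy for latency and energy in this sketch.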
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Adversarial Robustness in Machine Learning · Fuel Cells and Related Materials
Methods: Multi-Head Attention · Attention Is All You Need · Test · Linear Layer · Byte Pair Encoding · Residual Connection · Adafactor · Attention Dropout · SentencePiece · Dense Connections
