LLMEffiChecker: Understanding and Testing Efficiency Degradation of Large Language Models

Xiaoning Feng; Xiaohong Han; Simin Chen; Wei Yang

arXiv:2210.03696·cs.CL·May 28, 2024

TL;DR

This paper introduces LLMEffiChecker, a method for testing and understanding the efficiency robustness of large language models. It generates adversarial inputs that significantly delay response generation, exposing potential vulnerabilities.

Contribution

The paper presents a novel approach for testing LLM efficiency robustness using gradient-guided and causal inference techniques in white-box and black-box settings, respectively.
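To make the idea of a minimal perturbation concrete, here is a small sketch of a search over single-character substitutions that keeps the candidate producing the longest output. This is a plain brute-force search used only for illustration; it is not the gradient-guided or causal-inference technique from the paper. The names `single_char_substitutions`, `worst_case_input`, and `toy_output_length` are hypothetical, and the toy length function stands in for a real model.

```python
import string
from typing import Callable, Iterator, Tuple


def single_char_substitutions(
    text: str, alphabet: str = string.ascii_lowercase
) -> Iterator[str]:
    """Yield every string that differs from `text` in exactly one character."""
    for i, original in enumerate(text):
        for ch in alphabet:
            if ch != original:
                yield text[:i] + ch + text[i + 1:]


def worst_case_input(
    text: str, output_length: Callable[[str], int]
) -> Tuple[str, int]:
    """Brute-force search (an illustrative stand-in, not the paper's method)
    for the one-character perturbation that maximizes output length."""
    best_text, best_len = text, output_length(text)
    for candidate in single_char_substitutions(text):
        n = output_length(candidate)
        if n > best_len:  # strict comparison keeps the first best candidate
            best_text, best_len = candidate, n
    return best_text, best_len


def toy_output_length(text: str) -> int:
    """Hypothetical system under test: normally emits one token per input
    word, but any input containing 'zz' runs to a 200-token limit."""
    return 200 if "zz" in text else len(text.split())


seed = "a lazy cat"
perturbed, length = worst_case_input(seed, toy_output_length)
print(repr(perturbed), length, "baseline:", toy_output_length(seed))
```

In this toy setup a single changed character moves the output length from the baseline to the configured limit, which is the kind of gap an efficiency test looks for. A real search would query an actual model and would need a far cheaper candidate-selection strategy than exhaustive enumeration.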

Findings

01

Average latency increase of 325% to 3244% using minimal perturbations

02

Average energy consumption increase of 344% to 3616%

03

Effective delay of EOS generation in nine public LLMs.

Abstract

In this paper, we make the first attempt to understand and test the computation efficiency robustness of state-of-the-art LLMs. By analyzing the working mechanism and implementation of 20,543 publicly accessible LLMs, we observe a fundamental property of LLMs that could be manipulated in an adversarial manner to reduce computation efficiency significantly. Our key motivation is to generate test inputs that sufficiently delay the generation of EOS, so that LLMs have to go through enough iterations to reach the pre-configured threshold. We present LLMEffiChecker, which works in both white-box and black-box settings. In the white-box scenario, LLMEffiChecker develops a gradient-guided technique that searches for a minimal and unnoticeable perturbation at the character level, token level, and structure level. In the black-box scenario, LLMEffiChecker employs a causal inference-based…
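The property the abstract relies on can be sketched with a toy autoregressive decoding loop: generation stops when EOS is emitted or when a pre-configured length threshold is reached, so an input that suppresses EOS forces the maximum number of iterations. Everything below is an illustrative assumption: `greedy_decode`, `toy_model`, the EOS id `0`, and the trigger token `99` are hypothetical and do not come from the paper or from any real library.

```python
from typing import Callable, List

EOS = 0  # hypothetical end-of-sequence token id


def greedy_decode(
    next_token: Callable[[List[int]], int],
    prompt: List[int],
    max_length: int,
) -> List[int]:
    """Generate tokens until EOS appears or `max_length` tokens are produced."""
    output: List[int] = []
    for _ in range(max_length):
        token = next_token(prompt + output)
        output.append(token)
        if token == EOS:  # the only early exit from the loop
            break
    return output


def toy_model(context: List[int]) -> int:
    """Hypothetical stand-in for an LLM: emits EOS once the context holds
    six tokens, unless trigger token 99 is present, which suppresses EOS."""
    if 99 in context:
        return 1
    return EOS if len(context) >= 6 else 1


benign = greedy_decode(toy_model, [5, 6, 7], max_length=50)
adversarial = greedy_decode(toy_model, [5, 99, 7], max_length=50)
print(len(benign), len(adversarial))
```

Each loop iteration corresponds to one forward pass of the model, so the iteration count is a rough proxy for latency and energy use. The two prompts differ in a single token, yet the second one runs until the threshold.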

Code & Models

Repositories

seekingdream/nmtsloth (PyTorch, official)

Taxonomy

Topics: Ferroelectric and Negative Capacitance Devices · Adversarial Robustness in Machine Learning · Fuel Cells and Related Materials

Methods: Multi-Head Attention · Attention Is All You Need · Test · Linear Layer · Byte Pair Encoding · Residual Connection · Adafactor · Attention Dropout · SentencePiece · Dense Connections