The Early Bird Catches the Leak: Unveiling Timing Side Channels in LLM Serving Systems
Linke Song, Zixuan Pang, Wenhao Wang, Zihao Wang, XiaoFeng Wang, Hongbo Chen, Wei Song, Yier Jin, Dan Meng, Rui Hou

TL;DR
This paper uncovers new timing side channels in LLM serving systems that arise from shared caches and GPU memory allocations. It shows that attackers can exploit them to infer both confidential system prompts and prompts issued by other users, exposing privacy vulnerabilities in multi-user deployments.
Contribution
The paper reports what the authors describe as the first discovery of timing side channels in LLM serving systems, together with novel attack strategies that exploit cache behavior to infer confidential prompts.
Findings
Timing side channels exist in LLM serving systems, arising from shared caches and GPU memory allocations.
Attack methods can accurately infer private prompts and shared prompt prefixes.
Privacy risks are validated through experiments on popular online LLM services.
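The cache-based leak named in the findings above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the attack from the paper: `ToyPrefixCache`, `recover_prefix`, the linear latency model (cost proportional to the number of uncached tokens), and the small vocabulary are all hypothetical and chosen for illustration. The simulated latency is noise-free, which real measurements would not be.

```python
from typing import List, Sequence


def shared_prefix_len(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of leading tokens that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class ToyPrefixCache:
    """Hypothetical server: tokens covered by a cached prefix cost nothing."""

    def __init__(self, per_token_cost: float = 1.0) -> None:
        self.per_token_cost = per_token_cost
        self.cached: List[List[str]] = []

    def serve(self, tokens: Sequence[str]) -> float:
        # Longest prefix of this request already present in the shared cache.
        hit = max((shared_prefix_len(tokens, c) for c in self.cached), default=0)
        # Every request is cached, including the attacker's own probes.
        self.cached.append(list(tokens))
        # Assumed latency model: only uncached tokens take time.
        return (len(tokens) - hit) * self.per_token_cost


def recover_prefix(server: ToyPrefixCache, vocab: Sequence[str],
                   max_len: int = 32) -> List[str]:
    """Extend the guess one token at a time, keeping the fastest candidate."""
    guess: List[str] = []
    for _ in range(max_len):
        timings = {tok: server.serve(guess + [tok]) for tok in vocab}
        best = min(timings, key=timings.get)
        # If no candidate is faster than the rest, nothing more is cached.
        if timings[best] >= max(timings.values()):
            break
        guess.append(best)
    return guess


server = ToyPrefixCache()
secret = ["you", "are", "a", "helpful", "assistant"]
server.serve(secret)  # the victim's request populates the shared cache

vocab = ["a", "are", "assistant", "helpful", "the", "you", "cat"]
recovered = recover_prefix(server, vocab)
print(recovered)
```

In this toy model each probe that extends the victim's prefix by the correct token is served faster than the others, so the prefix can be rebuilt token by token. A realistic setting would likely need repeated measurements and a statistical threshold to separate cache hits from misses.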
Abstract
The wide deployment of Large Language Models (LLMs) has given rise to strong demands for optimizing their inference performance. Today's techniques serving this purpose primarily focus on reducing latency and improving throughput through algorithmic and hardware enhancements, while largely overlooking their privacy side effects, particularly in a multi-user environment. In our research, for the first time, we discovered a set of new timing side channels in LLM systems, arising from shared caches and GPU memory allocations, which can be exploited to infer both confidential system prompts and those issued by other users. These vulnerabilities echo security challenges observed in traditional computing systems, highlighting an urgent need to address potential information leakage in LLM serving infrastructures. In this paper, we report novel attack strategies designed to exploit such timing…
Taxonomy
Topics: Scheduling and Optimization Algorithms · Petri Nets in System Modeling · Industrial Automation and Control Systems
