I Know What You Said: Unveiling Hardware Cache Side-Channels in Local Large Language Model Inference

Zibo Gao; Junjie Hu; Feng Guo; Yixin Zhang; Yinglong Han; Siyuan Liu; Haiyang Li; Zhiqiang Lv

arXiv:2505.06738 · cs.CR · June 17, 2025

Open Access

TL;DR

This paper uncovers hardware cache side-channel vulnerabilities in local LLM inference that leak token values and token positions, exposing the victim's input and output text. It demonstrates the resulting privacy risk with an eavesdropping attack framework tested on various models.

Contribution

The paper introduces novel cache side-channel attacks that infer token values and token positions during local LLM inference without interacting directly with the victim.

Findings

01. Attack achieves high accuracy in inferring token data.

02. Reconstructed texts have low edit distance from ground truth.

03. High cosine similarity scores indicate effective leakage.

Abstract

Large Language Models (LLMs) that can be deployed locally have recently gained popularity for privacy-sensitive tasks, with companies such as Meta, Google, and Intel playing significant roles in their development. However, the security of local LLMs through the lens of hardware cache side-channels remains unexplored. In this paper, we unveil novel side-channel vulnerabilities in local LLM inference: token value and token position leakage, which can expose both the victim's input and output text, thereby compromising user privacy. Specifically, we found that adversaries can infer the token values from the cache access patterns of the token embedding operation, and deduce the token positions from the timing of autoregressive decoding phases. To demonstrate the potential of these leaks, we design a novel eavesdropping attack framework targeting both open-source and proprietary LLM…
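
The token value leak described in the abstract can be illustrated with a toy model. The sketch below is not the paper's attack and does no cache probing. It only simulates why the set of cache lines touched by an embedding lookup can identify a token. It assumes a contiguous, row-major embedding table that starts on a cache-line boundary, a 64-byte cache line, and a noise-free observer. The names `lines_touched` and `candidate_tokens` are hypothetical. Token position leakage from decoding-phase timing is not modeled.

```python
# Toy simulation only: no real cache measurement happens here.
CACHE_LINE_BYTES = 64  # assumed cache-line size


def lines_touched(token_id: int, row_bytes: int) -> frozenset:
    """Cache-line indices covered by the embedding row of token_id.

    Assumes row i of the table occupies bytes [i * row_bytes, (i + 1) * row_bytes).
    """
    first = (token_id * row_bytes) // CACHE_LINE_BYTES
    last = (token_id * row_bytes + row_bytes - 1) // CACHE_LINE_BYTES
    return frozenset(range(first, last + 1))


def candidate_tokens(observed: frozenset, vocab_size: int, row_bytes: int) -> list:
    """Token ids whose embedding row matches the observed set of touched lines."""
    return [t for t in range(vocab_size) if lines_touched(t, row_bytes) == observed]


if __name__ == "__main__":
    wide = 4096 * 2   # hypothetical row: 4096 fp16 values, spans 128 cache lines
    narrow = 8 * 2    # hypothetical row: 8 fp16 values, four rows share one line

    # A wide row maps to a unique set of lines, so the token is identified.
    print(candidate_tokens(lines_touched(5, wide), 100, wide))      # [5]

    # Narrow rows share a cache line, so several tokens stay indistinguishable.
    print(candidate_tokens(lines_touched(5, narrow), 100, narrow))  # [4, 5, 6, 7]
```

Under these assumptions, a row wider than one cache line gives each token a distinct footprint. Rows narrower than a line leave a small set of candidates instead of a single token.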

Taxonomy

Topics: Topic Modeling · Speech Recognition and Synthesis · Natural Language Processing Techniques