Time Will Tell: Timing Side Channels via Output Token Count in Large Language Models
Tianchen Zhang, Gururaj Saileshwar, David Lie

TL;DR
This paper uncovers a timing side-channel in large language models that leaks sensitive input information through output token count, demonstrating practical attacks on translation and classification tasks and proposing mitigations.
Contribution
The paper introduces a timing side-channel based on output token count in LLMs and shows that it can be used to extract sensitive information about inference inputs.
Findings
More than 75% precision in identifying the target language in translation tasks, across three models (Tower, M2M100, MBart50).
More than 70% accuracy in leaking the input class in text classification tasks.
Effective mitigations are proposed against the token count side-channel.
Abstract
This paper demonstrates a new side-channel that enables an adversary to extract sensitive information about inference inputs in large language models (LLMs) based on the number of output tokens in the LLM response. We construct attacks using this side-channel in two common LLM tasks: recovering the target language in machine translation tasks and recovering the output class in classification tasks. In addition, due to the auto-regressive generation mechanism in LLMs, an adversary can recover the output token count reliably using a timing channel, even over the network against a popular closed-source commercial LLM. Our experiments show that an adversary can learn the output language in translation tasks with more than 75% precision across three different models (Tower, M2M100, MBart50). Using this side-channel, we also show the input class in text classification tasks can be leaked out…
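The two steps the abstract describes (timing reveals the output token count, and the token count reveals something about the input) can be illustrated with a minimal sketch. This is not the authors' implementation. The linear latency model, the nearest-profile classifier, the function names, and all numbers below are assumptions chosen for illustration only.

```python
import statistics


def estimate_token_count(latency_s, prefill_s, per_token_s):
    """Estimate output tokens from end-to-end response latency.

    Assumes (hypothetically) that auto-regressive decoding takes a roughly
    constant time per output token after a fixed prefill/network overhead.
    """
    return max(0, round((latency_s - prefill_s) / per_token_s))


def fit_profiles(samples):
    """Build a (mean, stdev) token-count profile for each candidate label.

    `samples` maps a label (e.g. a target language) to token counts the
    attacker observed while profiling the model offline.
    """
    profiles = {}
    for label, counts in samples.items():
        mean = statistics.mean(counts)
        # Fall back to 1.0 so a zero spread never causes division by zero.
        stdev = statistics.pstdev(counts) or 1.0
        profiles[label] = (mean, stdev)
    return profiles


def guess_label(token_count, profiles):
    """Return the label whose profile is closest in standard deviations."""
    return min(
        profiles,
        key=lambda label: abs(token_count - profiles[label][0]) / profiles[label][1],
    )


# Synthetic profiling data (made up for this sketch, not from the paper).
profiles = fit_profiles({
    "lang_a": [38, 40, 42, 41, 39],
    "lang_b": [55, 58, 60, 57, 56],
})

# Attacker observes a 1.30 s response; overhead and per-token time are assumed.
observed_tokens = estimate_token_count(latency_s=1.30, prefill_s=0.10, per_token_s=0.03)
guess = guess_label(observed_tokens, profiles)
```

In this toy setup the attacker never sees the response text. Only the measured latency and the offline profiles are used, which is the reason a token-count signal can matter even when the traffic is encrypted.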
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling
