CryptoGen: Secure Transformer Generation with Encrypted KV-Cache Reuse

Hedong Zhang; Neusha Javidnia; Shweta Pardeshi; Qian Lou; Farinaz Koushanfar

arXiv:2602.08798 · cs.CR · February 16, 2026

TL;DR

CryptoGen introduces a scalable, privacy-preserving system for autoregressive Transformer generation that efficiently reuses encrypted KV-caches, significantly reducing latency while maintaining security in untrusted environments.

Contribution

The authors present CryptoGen as the first system to enable scalable, privacy-preserving neural generation with encrypted KV-cache reuse. It combines homomorphic encryption with secret sharing.
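To make the secret-sharing half of that combination concrete, here is a minimal sketch of two-party additive secret sharing, a standard building block in secure inference. It is not CryptoGen's protocol: the ring size `MODULUS` and the helper names `share`, `reconstruct`, and `add_shares` are assumptions chosen for illustration, and the homomorphic-encryption side is not shown.

```python
import secrets

# Hypothetical ring size for illustration; real systems pick this to fit
# their fixed-point encoding.
MODULUS = 2**32


def share(value):
    """Split an integer into two additive shares modulo MODULUS."""
    r = secrets.randbelow(MODULUS)        # uniformly random mask
    return r, (value - r) % MODULUS       # neither share alone reveals value


def reconstruct(s0, s1):
    """Recombine the two shares into the original value."""
    return (s0 + s1) % MODULUS


def add_shares(a, b):
    """Add two shared values: each party adds its own shares locally."""
    return (a[0] + b[0]) % MODULUS, (a[1] + b[1]) % MODULUS


x = share(7)
y = share(35)
total = reconstruct(*add_shares(x, y))    # equals 7 + 35
```

Addition needs no communication between the parties, which is one reason linear operations are cheap on shared data. Multiplication and nonlinear functions require extra protocol steps that this sketch omits.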

Findings

01 Achieves 4.4x-7.6x lower per-token latency than existing systems.

02 Maintains near-linear latency and memory scaling for input lengths up to 512 tokens.

03 Demonstrates effectiveness on models trained on WikiText-2, PTB, and LAMBADA datasets.

Abstract

The widespread deployment of cloud-hosted generative models raises a fundamental challenge: enabling efficient autoregressive generation while preserving the privacy of both user prompts and model parameters in untrusted environments. We address this challenge in a client-server setting where an untrusted server hosts an autoregressive Transformer and the client requires cryptographic protection for both inputs and inference. We present CryptoGen, the first system to enable scalable privacy-preserving neural generation with persistent encrypted key-value (KV) cache reuse. Discriminative-task secure inference systems incur quadratic latency and memory growth when adapted to autoregressive decoding due to the lack of native encrypted KV-cache support. In contrast, CryptoGen achieves near-linear scaling by securely reusing and updating encrypted KV caches throughout generation. CryptoGen…
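The scaling contrast in the abstract can be illustrated with a toy cost model. The sketch below is plaintext only and rests on a simplifying assumption: it counts how many token positions are pushed through the model during decoding, and ignores both the attention cost over cached entries and all cryptographic overhead. The function names are hypothetical and do not come from the paper.

```python
def tokens_processed_without_cache(prompt_len, new_tokens):
    """Each decoding step re-runs the model over the whole sequence so far."""
    return sum(prompt_len + t for t in range(new_tokens))


def tokens_processed_with_cache(prompt_len, new_tokens):
    """The prompt is processed once; each later step handles one new token."""
    return prompt_len + (new_tokens - 1)


# Example: an 8-token prompt followed by 4 generated tokens.
no_cache = tokens_processed_without_cache(8, 4)   # 8 + 9 + 10 + 11 = 38
cached = tokens_processed_with_cache(8, 4)        # 8 + 3 = 11
```

Without a cache the count grows quadratically in the number of generated tokens, since step t reprocesses all earlier positions. With cache reuse it grows linearly. In a secure setting every processed position carries cryptographic cost, so the gap matters more than in plaintext inference.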

Taxonomy

Topics: Cryptography and Data Security · Chaos-based Image/Signal Encryption · Adversarial Robustness in Machine Learning