Adapting AlphaEvolve to Optimize Fully Homomorphic Encryption on TPUs
Shruthi Gorantala, Jianming Tong, Asra Ali, Baiyu Li, Jonathan Katz, Jeremy Kun, Thomas Steinke, Abhradeep Thakurta, Julian Walker, Amir Yazdanbakhsh

TL;DR
This paper adapts AlphaEvolve, an automated system that combines evolutionary search with LLM-driven code generation, to optimize Fully Homomorphic Encryption (FHE) primitives on TPUs, reporting latency improvements of 1.18x to 2.5x on the evaluated kernels.
Contribution
The paper introduces a hardware-aware optimization framework for cryptographic kernels on TPUs. It combines evolutionary algorithms with LLMs to automate low-level hardware tuning that otherwise relies on manual trial and error.
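To make the idea of evolutionary search over kernel variants concrete, here is a minimal toy sketch. It is not the paper's system: `toy_latency`, `mutate`, and `evolve` are hypothetical names, the "kernel" is reduced to a single integer tile size, the latency function is synthetic rather than a TPU measurement, and a random perturbation stands in for an LLM-proposed code edit.

```python
import random


def toy_latency(tile: int) -> float:
    """Synthetic stand-in for a measured kernel latency (not real TPU data)."""
    return abs(tile - 128) + 10.0


def mutate(tile: int, rng: random.Random) -> int:
    """Random perturbation standing in for an LLM-proposed code edit."""
    return max(1, tile + rng.choice([-32, -8, -1, 1, 8, 32]))


def evolve(generations: int = 200, population_size: int = 8, seed: int = 0):
    """Evolve tile sizes toward lower synthetic latency; returns (best, latency)."""
    rng = random.Random(seed)
    population = [rng.randint(1, 512) for _ in range(population_size)]
    for _ in range(generations):
        children = [mutate(parent, rng) for parent in population]
        # Elitist selection: keep the lowest-latency candidates among
        # parents and children, so the best candidate never gets worse.
        population = sorted(population + children, key=toy_latency)[:population_size]
    best = population[0]
    return best, toy_latency(best)


if __name__ == "__main__":
    best_tile, best_latency = evolve()
    print(f"best tile size: {best_tile}, synthetic latency: {best_latency}")
```

In a realistic setting, the fitness function would compile and time each candidate kernel on hardware, and it would also need a correctness check so that fast but wrong candidates are rejected.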
Findings
AlphaEvolve improved TFHE bootstrap latency by 2.5x.
It improved CKKS rotation latency by 1.31x.
It improved CKKS multiplication latency by 1.18x.
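A speedup factor and a percentage latency reduction are easy to confuse, so the short sketch below converts one to the other. It assumes that each "x" figure above means baseline latency divided by optimized latency; that reading is an assumption, not something the summary states. The helper name `latency_reduction_percent` is hypothetical.

```python
def latency_reduction_percent(speedup: float) -> float:
    """Convert a speedup factor (baseline / optimized) to percent latency saved."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 - 1.0 / speedup) * 100.0


# Speedup factors as listed in the findings above.
findings = [
    ("TFHE bootstrap", 2.5),
    ("CKKS rotation", 1.31),
    ("CKKS multiplication", 1.18),
]

for name, speedup in findings:
    saved = latency_reduction_percent(speedup)
    print(f"{name}: {speedup}x speedup -> {saved:.1f}% lower latency")
```

Under that assumption, a 2.5x speedup corresponds to 60.0% lower latency, 1.31x to about 23.7%, and 1.18x to about 15.3%.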
Abstract
The deployment of Fully Homomorphic Encryption (FHE) at scale is hindered by its heavy computational overhead. While specialized hardware accelerators like Google Tensor Processing Units (TPUs) can help, mapping complex cryptographic kernels onto such architectures remains a challenge. Efficient execution requires co-optimization between the systolic array-based Matrix Multiplication Unit (MXU) and Vector Processing Units (VPUs), as well as the orchestration of data movement across the vector register files. Existing compiler stacks often abstract away low-level hardware utilization, forcing developers into a manual trial-and-error process that frequently results in fragmented execution and underutilized resources. To accelerate this development process, we use AlphaEvolve to automate the exploration of hardware-aware cryptographic-kernel optimizations. We frame optimization as an…
