EncFormer: Secure and Efficient Transformer Inference over Encrypted Data
Yufan Zhu, Chao Jin, Khin Mi Mi Aung, Xiaokui Xiao

TL;DR
EncFormer is a two-party framework for secure Transformer inference over encrypted data. It optimizes how FHE and MPC are combined, reducing both communication and latency.
Contribution
EncFormer introduces Stage Compatible Patterns, which let FHE kernels compose with less repacking and fewer conversions, and a secure complex CKKS-MPC conversion protocol that reduces communication in privacy-preserving Transformer inference.
Findings
Achieves 1.4x-30.4x lower MPC communication compared to prior systems.
Reduces end-to-end latency by up to 9.8x on GPT/BERT models.
Maintains near-plaintext accuracy on selected GLUE tasks.
Abstract
Transformer inference in machine-learning-as-a-service (MLaaS) raises privacy concerns for sensitive user inputs. Prior secure solutions that combine fully homomorphic encryption (FHE) and secure multiparty computation (MPC) are bottlenecked by inefficient FHE kernels, communication-heavy MPC protocols, and expensive FHE-MPC conversions. We present EncFormer, a two-party private Transformer inference framework that introduces Stage Compatible Patterns so that FHE kernels compose efficiently, reducing repacking and conversions. EncFormer also provides a cost analysis model built around a minimal-conversion baseline, enabling principled selection of FHE-MPC boundaries. To further reduce communication, EncFormer proposes a secure complex CKKS-MPC conversion protocol and designs communication-efficient MPC protocols for nonlinearities. With GPU optimizations, evaluations on GPT- and…
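To make the idea of an FHE-MPC conversion concrete, here is a minimal sketch of the generic masking approach often used to move a value between a homomorphic ciphertext and additive secret shares. It is not EncFormer's complex CKKS-MPC protocol, whose details are not given in this summary. The modulus `Q`, the `ToyCiphertext` class, and all function names are hypothetical, and `ToyCiphertext` performs no encryption at all: it only mimics the "add a plaintext to a ciphertext" operation that an additively homomorphic scheme provides.

```python
import secrets

Q = 2**61 - 1  # hypothetical modulus for the additive shares (assumption)


class ToyCiphertext:
    """Stand-in for an additively homomorphic ciphertext.

    NOT real encryption: it simply wraps the value so the data flow
    of the conversion can be followed.
    """

    def __init__(self, value):
        self._value = value % Q

    def add_plain(self, k):
        # Mimics homomorphic addition of a plaintext constant.
        return ToyCiphertext(self._value + k)


def client_encrypt(x):
    # The client holds the secret key in this two-party setting.
    return ToyCiphertext(x)


def client_decrypt(ct):
    return ct._value


def he_to_shares(ct):
    """Ciphertext -> additive shares (server masks, client decrypts)."""
    r = secrets.randbelow(Q)        # server's random mask
    masked = ct.add_plain(-r)       # server computes Enc(x - r)
    client_share = client_decrypt(masked)  # client learns only x - r
    server_share = r
    return client_share, server_share


def shares_to_he(client_share, server_share):
    """Additive shares -> ciphertext (client encrypts, server adds)."""
    ct = client_encrypt(client_share)   # client sends Enc(x - r)
    return ct.add_plain(server_share)   # server adds r homomorphically


x = 123456789
c_share, s_share = he_to_shares(client_encrypt(x))
roundtrip = client_decrypt(shares_to_he(c_share, s_share))
```

Each conversion in this sketch costs one ciphertext transfer between the parties, which is why reducing the number of FHE-MPC boundaries, as the abstract describes, directly reduces communication.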
