Private Transformer Inference in MLaaS: A Survey

Yang Li; Xinyu Zhou; Yitong Wang; Liangxin Qian; Jun Zhao

arXiv:2505.10315·cs.CR·May 16, 2025

Open Access

TL;DR

This survey reviews recent advances in Private Transformer Inference (PTI) techniques that enable privacy-preserving AI model deployment in MLaaS, emphasizing cryptographic methods, challenges, and evaluation frameworks.

Contribution

The survey introduces a structured taxonomy and evaluation framework for PTI, highlights recent solutions, and examines the trade-off between privacy, resource efficiency, and inference performance.

Findings

01. Overview of cryptographic techniques like secure multi-party computation and homomorphic encryption.

02. Identification of key challenges in resource efficiency and privacy trade-offs.

03. Proposed evaluation framework for PTI solutions.

Abstract

Transformer models have revolutionized AI, powering applications like content generation and sentiment analysis. However, their deployment in Machine Learning as a Service (MLaaS) raises significant privacy concerns, primarily due to the centralized processing of sensitive user data. Private Transformer Inference (PTI) offers a solution by utilizing cryptographic techniques such as secure multi-party computation and homomorphic encryption, enabling inference while preserving both user data and model privacy. This paper reviews recent PTI advancements, highlighting state-of-the-art solutions and challenges. We also introduce a structured taxonomy and evaluation framework for PTI, focusing on balancing resource efficiency with privacy and bridging the gap between high-performance inference and data privacy.
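
As a rough illustration of the secure multi-party computation idea named in the abstract, the sketch below uses two-party additive secret sharing to evaluate a single linear layer. It is not taken from the survey: the ring size `MOD`, the function names, and the simplification that both computing parties know the integer weights are all assumptions made for this example. Real PTI protocols typically also protect the model weights and need dedicated sub-protocols for non-linear operations such as softmax.

```python
import secrets

MOD = 2 ** 32  # assumed ring size for the shares (illustrative choice)

def share(x):
    """Split each integer in x into two additive shares modulo MOD."""
    s0 = [secrets.randbelow(MOD) for _ in x]
    s1 = [(xi - r) % MOD for xi, r in zip(x, s0)]
    return s0, s1

def linear(weights, vec):
    """Matrix-vector product modulo MOD, computed locally by one party."""
    return [sum(w * v for w, v in zip(row, vec)) % MOD for row in weights]

def reconstruct(y0, y1):
    """Add the two output shares to recover the plaintext result."""
    return [(a + b) % MOD for a, b in zip(y0, y1)]

def private_linear(weights, x):
    """Hypothetical end-to-end flow: share, compute on shares, reconstruct."""
    x0, x1 = share(x)          # client splits its input
    y0 = linear(weights, x0)   # party 0 sees only x0
    y1 = linear(weights, x1)   # party 1 sees only x1
    return reconstruct(y0, y1) # client combines the output shares
```

Because a linear layer distributes over addition, each party can work on its share alone and the two outputs sum to the true result. Each share on its own is a uniformly random ring element, so neither party learns the input unless the two collude.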

Taxonomy

Topics: Cryptography and Data Security · Library Science and Information Systems

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Dense Connections · Dropout · Layer Normalization · Byte Pair Encoding · Softmax · Absolute Position Encodings