SecFormer: Fast and Accurate Privacy-Preserving Inference for Transformer Models via SMPC
Jinglong Luo, Yehong Zhang, Zhuo Zhang, Jiaqi Zhang, Xin Mu, Hui Wang, Yue Yu, Zenglin Xu

TL;DR
SecFormer is a framework for fast, accurate, privacy-preserving inference on Transformer models using Secure Multi-Party Computing (SMPC). It optimizes the nonlinear operations and eliminates the high-cost exponential and maximum operations.
Contribution
SecFormer introduces a comprehensive privacy-preserving inference (PPI) framework that removes high-cost operations and develops efficient SMPC protocols, improving both speed and accuracy over prior methods.
Findings
SecFormer improves accuracy over MPCFormer by 3.4-24.7%.
SecFormer is 3.57-3.58 times faster than PUMA.
The framework effectively optimizes nonlinear functions like GeLU, LayerNorm, and Softmax.
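Softmax is expensive under SMPC because the standard formulation needs an exponential per element and a maximum for numerical stability. The sketch below contrasts it with a quadratic normalization that needs only additions, multiplications, and one division. It is an illustration of the general idea of an exponential-free substitute, not necessarily the exact form SecFormer uses; the function name `quad_normalize` and the shift constant `c` are assumptions made for this example.

```python
import math

def softmax(xs):
    """Standard softmax: needs a maximum and one exponential per element."""
    m = max(xs)                           # maximum, used for numerical stability
    exps = [math.exp(x - m) for x in xs]  # exponential, costly under SMPC
    s = sum(exps)
    return [e / s for e in exps]

def quad_normalize(xs, c=5.0):
    """Hypothetical exponential-free substitute: (x + c)^2 / sum((x + c)^2).

    Uses only addition, multiplication, and one division. The ordering of the
    inputs is preserved only while x + c > 0 for every x; c is an assumed
    constant chosen for this sketch.
    """
    sq = [(x + c) ** 2 for x in xs]
    s = sum(sq)
    return [v / s for v in sq]

scores = [1.0, 2.0, 3.0]
p_soft = softmax(scores)
p_quad = quad_normalize(scores)
```

Both outputs are non-negative and sum to 1, but the quadratic version produces a flatter distribution, which is why substitutes of this kind generally need some accompanying adjustment to the model to keep accuracy.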
Abstract
With the growing use of Transformer models hosted on cloud platforms to offer inference services, privacy concerns are escalating, especially concerning sensitive data like investment plans and bank account details. Secure Multi-Party Computing (SMPC) emerges as a promising solution to protect the privacy of inference data and model parameters. However, the application of SMPC in Privacy-Preserving Inference (PPI) for Transformer models often leads to considerable slowdowns or declines in performance. This is largely due to the multitude of nonlinear operations in the Transformer architecture, which are not well-suited to SMPC and difficult to circumvent or optimize effectively. To address this concern, we introduce a comprehensive PPI framework called SecFormer to achieve fast and accurate PPI for Transformer models. We successfully eliminate the high-cost exponential and maximum…
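To make the cost asymmetry concrete, here is a minimal sketch of additive secret sharing, a common building block of SMPC. It assumes a ring of size 2^64 and two parties, and the helper names (`share`, `reconstruct`, `add_shared`, `scale_shared`) are hypothetical, not taken from SecFormer. Additions and multiplications by public constants can be computed on shares locally, while functions such as exponentials or maxima cannot be applied share by share and require interactive protocols.

```python
import secrets

MOD = 2 ** 64  # ring size; an assumption for this sketch

def share(x, n_parties=2):
    """Split integer x into additive shares that sum to x modulo MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    shares.append((x - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    """Recover the secret by summing all shares modulo MOD."""
    return sum(shares) % MOD

def add_shared(a, b):
    """Each party adds its own two shares locally; no communication."""
    return [(ai + bi) % MOD for ai, bi in zip(a, b)]

def scale_shared(a, k):
    """Multiplying by a public constant k is also a local operation."""
    return [(ai * k) % MOD for ai in a]

x = share(12)
y = share(30)
total = reconstruct(add_shared(x, y))      # 12 + 30
scaled = reconstruct(scale_shared(x, 3))   # 12 * 3
```

Each individual share is a uniformly random ring element and reveals nothing about the secret on its own; only the sum of all shares does.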
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Privacy-Preserving Technologies in Data · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Dropout · Absolute Position Encodings · Layer Normalization · Residual Connection
