Privacy-Preserving Inference for Quantized BERT Models
Tianpei Lu, Bingsheng Zhang, Lekun Peng, Bowen Zheng, Lichun Li, Kui Ren

TL;DR
This paper introduces a privacy-preserving inference method for quantized BERT models that significantly reduces computational overhead using layer-wise quantization, secure lookup tables, and dual secret sharing, enabling efficient secure NLP inference.
Contribution
It proposes a novel layer-wise quantization scheme, supports 1-bit weights in secure inference, and designs efficient protocols for nonlinear functions like softmax, improving speed and privacy in secure BERT inference.
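To illustrate why lookup tables suit quantized nonlinear functions, here is a minimal plaintext sketch: when an input is a low-bit integer, every possible output can be precomputed once and evaluation becomes a single table read. The names `build_exp_table`, `lut_exp`, and `lut_softmax`, the 4-bit width, and the scale value are assumptions chosen for illustration, not taken from the paper. The secure protocol would read the table obliviously on a secret-shared index, which is not shown here.

```python
import math
from typing import List


def build_exp_table(bits: int, scale: float) -> List[float]:
    """Precompute exp(scale * q) for every signed `bits`-bit integer q."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return [math.exp(scale * q) for q in range(lo, hi + 1)]


def lut_exp(q: int, table: List[float], bits: int) -> float:
    """Evaluate exp with one table read; the offset maps q into [0, 2^bits)."""
    return table[q + 2 ** (bits - 1)]


def lut_softmax(qs: List[int], table: List[float], bits: int) -> List[float]:
    """Softmax over quantized scores, using the table for every exponential."""
    exps = [lut_exp(q, table, bits) for q in qs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical parameters: 4-bit signed inputs with a per-layer scale of 0.25.
BITS, SCALE = 4, 0.25
table = build_exp_table(BITS, SCALE)
probs = lut_softmax([-3, 0, 5], table, BITS)
```

The table has only `2 ** bits` entries, so its size stays small exactly when the quantization is aggressive.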
Findings
Achieves up to 22x speedup over previous methods.
Supports 1-bit weight fully connected layers securely.
Reduces overhead in privacy-preserving BERT inference.
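The appeal of 1-bit weights can be sketched with plain additive secret sharing: if each weight is +1 or -1, a fully connected layer reduces to signed sums, which each party can compute on its own shares without interaction. This is a simplified illustration, not the paper's protocol: it assumes two parties, a ring of size 2^32, and, for simplicity, weights known to both parties in the clear, whereas a secure inference setting would typically keep the weights secret as well. All function names are hypothetical.

```python
import secrets
from typing import List, Tuple

MOD = 2 ** 32  # assumed ring size for the shares


def share(x: int) -> Tuple[int, int]:
    """Split x into two additive shares that sum to x modulo MOD."""
    r = secrets.randbelow(MOD)
    return r, (x - r) % MOD


def reconstruct(s0: int, s1: int) -> int:
    """Recombine two shares and map the result back to a signed integer."""
    v = (s0 + s1) % MOD
    return v - MOD if v >= MOD // 2 else v


def sign_dot(shares: List[int], signs: List[int]) -> int:
    """Local step: add or subtract each share according to its +1/-1 weight."""
    return sum(s if w == 1 else -s for s, w in zip(shares, signs)) % MOD


# Hypothetical quantized activations and 1-bit weights encoded as +1 / -1.
x = [3, -5, 7]
w = [1, -1, 1]

shared = [share(v) for v in x]
y0 = sign_dot([s[0] for s in shared], w)  # computed by party 0 alone
y1 = sign_dot([s[1] for s in shared], w)  # computed by party 1 alone
y = reconstruct(y0, y1)
```

Reconstructing `y` yields the same value as the plaintext dot product of `x` and `w`, because the signed sum is linear in the shares.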
Abstract
With the increasing deployment of generative machine learning models in privacy-sensitive domains such as healthcare and personalized services, ensuring secure inference has become a critical challenge. Secure multi-party computation (MPC) enables privacy-preserving model inference but suffers from high communication and computation overhead. The main bottleneck lies in the expensive secure evaluation of floating-point operations. Quantization offers a promising solution by converting floating-point operations into lower-precision integer computations, significantly reducing overhead. However, existing MPC-based quantized inference methods either rely on public quantization parameters, which poses privacy risks, or suffer from inefficiencies, particularly in handling nonlinear functions such as activations and softmax. In this work, we propose a fine-grained, layer-wise quantization scheme and…
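The idea of replacing floating-point operations with integer ones can be sketched in plaintext as follows. This is a generic symmetric quantizer with one scale per layer, offered as an assumption-laden illustration rather than the scheme the paper proposes; `quantize_layer`, `dequantize`, and `int_dot` are hypothetical names, and the scale is kept in the clear here, which a privacy-preserving protocol would need to avoid.

```python
from typing import List, Tuple


def quantize_layer(values: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Symmetric quantization using a single scale for the whole layer."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp into [-qmax, qmax].
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Map integers back to approximate real values."""
    return [v * scale for v in q]


def int_dot(qa: List[int], qb: List[int]) -> int:
    """Dot product carried out entirely in integer arithmetic."""
    return sum(a * b for a, b in zip(qa, qb))


# Hypothetical weights and activations for one layer.
weights = [0.5, -1.0, 0.25, 0.0]
inputs = [1.0, 2.0, -0.5, 0.25]

qw, sw = quantize_layer(weights)
qx, sx = quantize_layer(inputs)

# The integer result is rescaled once at the end by the product of the scales.
approx = int_dot(qw, qx) * sw * sx
exact = sum(a * b for a, b in zip(weights, inputs))
```

Only the final rescaling touches floating point; the inner loop, which dominates the cost, works on small integers.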
Taxonomy
Topics: Cryptography and Data Security · Privacy-Preserving Technologies in Data · Cryptography and Residue Arithmetic
