MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision Transformer with Heterogeneous Attention

Wenxuan Zeng; Meng Li; Wenjie Xiong; Tong Tong; Wen-jie Lu; Jin Tan; Runsheng Wang; Ru Huang

arXiv:2211.13955·cs.CR·August 22, 2023·1 cites

Open Access · 1 Repo

TL;DR

This paper introduces MPCViT, an MPC-friendly Vision Transformer optimized for secure inference, reducing latency significantly while maintaining high accuracy through heterogeneous attention and neural architecture search.

Contribution

The paper proposes MPCViT, a novel MPC-optimized Vision Transformer with heterogeneous attention and a neural architecture search method for efficient secure inference.

Findings

01. MPCViT achieves up to 6.2x latency reduction with higher accuracy.

02. MPCViT+ further improves the Pareto front of accuracy and efficiency.

03. Selective Softmax linearization reduces latency without accuracy loss.

Abstract

Secure multi-party computation (MPC) enables computation directly on encrypted data and protects both data and model privacy in deep learning inference. However, existing neural network architectures, including Vision Transformers (ViTs), are not designed or optimized for MPC and incur significant latency overhead. We observe that Softmax is the major latency bottleneck because of its high communication complexity, but that it can be selectively replaced or linearized without compromising model accuracy. Hence, in this paper, we propose an MPC-friendly ViT, dubbed MPCViT, to enable accurate yet efficient ViT inference in MPC. Based on a systematic latency and accuracy evaluation of the Softmax attention and other attention variants, we propose a heterogeneous attention optimization space. We also develop a simple yet effective MPC-aware neural architecture search algorithm for fast Pareto…
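The idea of selectively replacing or linearizing Softmax can be illustrated with a minimal plaintext sketch. This is not the authors' implementation and runs no MPC protocol. It assumes a hypothetical softmax-free variant (`scaling_attention`, which divides the scores by the token count) and a hand-set per-head mask (`use_softmax`); the abstract indicates that the paper chooses the attention types through its MPC-aware architecture search instead. All function names here are illustrative.

```python
import math

def matmul(a, b):
    # Plain list-of-lists matrix product.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(scores):
    # Row-wise softmax; exp and division are the operations that are costly in MPC.
    out = []
    for row in scores:
        peak = max(row)
        exps = [math.exp(s - peak) for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def softmax_attention(q, k, v):
    # Standard scaled dot-product attention for one head.
    d = len(q[0])
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, transpose(k))]
    return matmul(softmax_rows(scores), v)

def scaling_attention(q, k, v):
    # Assumed linearized variant: no exp, only a constant scale by the token count.
    n = len(k)
    scores = [[s / n for s in row] for row in matmul(q, transpose(k))]
    return matmul(scores, v)

def heterogeneous_attention(heads, use_softmax):
    # heads: list of (q, k, v) tuples; use_softmax: one bool per head (hypothetical mask).
    return [
        softmax_attention(q, k, v) if keep else scaling_attention(q, k, v)
        for (q, k, v), keep in zip(heads, use_softmax)
    ]

q = [[1, 0], [0, 1]]
k = [[1, 0], [0, 1]]
v = [[1, 2], [3, 4]]
outputs = heterogeneous_attention([(q, k, v), (q, k, v)], [True, False])
```

Here the first head keeps Softmax and the second uses the cheaper variant, so `outputs` holds one result per head. In a real search the mask would be chosen to trade accuracy against latency rather than fixed by hand.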

Code & Models

Repositories

pku-sec-lab/mpcvit (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Glioma Diagnosis and Treatment · Brain Tumor Detection and Classification

Methods: Attention Is All You Need · Multi-Head Linear Attention · Dense Connections · Residual Connection · Layer Normalization · Knowledge Distillation · Linformer · Linear Layer · Softmax