AutoReP: Automatic ReLU Replacement for Fast Private Network Inference
Hongwu Peng, Shaoyi Huang, Tong Zhou, Yukui Luo, Chenghong Wang, Zigeng Wang, Jiahui Zhao, Xi Xie, Ang Li, Tony Geng, Kaleel Mahmood, Wujie Wen, Xiaolin Xu, Caiwen Ding

TL;DR
AutoReP is a gradient-based method that automates ReLU and polynomial function selection to significantly improve the efficiency and accuracy of private neural network inference, reducing ReLU operations while maintaining model performance.
Contribution
The paper introduces AutoReP, an automated approach that combines gradient-based ReLU replacement with distribution-aware polynomial approximation (DaPa) to speed up private inference by reducing non-linear operators while maintaining accuracy.
Findings
Achieved up to 9.45% accuracy improvement over state-of-the-art methods.
Reduced ReLU operations by up to 176.1 times on ImageNet with maintained accuracy.
Demonstrated effectiveness on multiple datasets including CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet.
Abstract
The growth of the Machine-Learning-As-A-Service (MLaaS) market has highlighted clients' data privacy and security issues. Private inference (PI) techniques using cryptographic primitives offer a solution but often have high computation and communication costs, particularly with non-linear operators like ReLU. Many attempts to reduce ReLU operations exist, but they may need heuristic threshold selection or cause substantial accuracy loss. This work introduces AutoReP, a gradient-based approach to lessen non-linear operators and alleviate these issues. It automates the selection of ReLU and polynomial functions to speed up PI applications and introduces distribution-aware polynomial approximation (DaPa) to maintain model expressivity while accurately approximating ReLUs. Our experimental results demonstrate significant accuracy improvements of 6.12% (94.31%, 12.9K ReLU budget, CIFAR-10),…
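The idea of replacing a ReLU with a polynomial fitted to the input distribution can be illustrated with a small sketch. This is not the paper's DaPa procedure, whose details are not given on this page. It assumes the activation inputs are Gaussian, fits a degree-2 polynomial to ReLU by least squares over that distribution, and applies a per-unit switch between ReLU and the polynomial. All names (`fit_quadratic_to_relu`, `mixed_activation`, `keep_relu`) are hypothetical, and the gradient-based learning of the switch is not shown.

```python
from statistics import NormalDist


def relu(x):
    return max(0.0, x)


def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic_to_relu(mean=0.0, std=1.0, n=2001):
    """Least-squares fit of c0 + c1*x + c2*x^2 to ReLU(x).

    Assumption: inputs follow N(mean, std^2). The distribution is
    represented by n evenly spaced quantiles, so the fit is deterministic.
    """
    dist = NormalDist(mean, std)
    xs = [dist.inv_cdf((i + 0.5) / n) for i in range(n)]
    # Normal equations for the basis 1, x, x^2:
    # s[k] approximates E[x^k], t[k] approximates E[ReLU(x) * x^k].
    s = [sum(x ** k for x in xs) / n for k in range(5)]
    t = [sum(relu(x) * x ** k for x in xs) / n for k in range(3)]
    a = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve3(a, t)


def mixed_activation(x, keep_relu, coeffs):
    """Apply ReLU if keep_relu is True, otherwise the fitted polynomial."""
    c0, c1, c2 = coeffs
    return relu(x) if keep_relu else c0 + c1 * x + c2 * x * x


coeffs = fit_quadratic_to_relu()
```

For standard normal inputs, the exact least-squares solution works out to approximately 0.20 + 0.5x + 0.20x², and the quantile-grid fit above should land close to it. In a private-inference setting the polynomial branch needs only multiplications and additions, which is why replacing a ReLU with it is cheaper than evaluating a comparison.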
Taxonomy
Topics: Cryptography and Data Security · Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
