S3ML: A Secure Serving System for Machine Learning Inference
Junming Ma, Chaofan Yu, Aihui Zhou, Bingzhe Wu, Xibin Wu, Xingyu Chen, Xiangqun Chen, Lei Wang, Donggang Cao

TL;DR
S3ML is a secure, scalable system that runs machine learning inference within Intel SGX enclaves, ensuring user privacy while maintaining high performance and service quality.
Contribution
It introduces a novel SGX-aware load balancing and scaling approach, along with a secure key management service, to enhance privacy-preserving ML inference systems.
Findings
S3ML achieves high throughput and low latency in experiments
The system effectively balances load while preserving privacy
S3ML demonstrates scalability and robustness in diverse scenarios
Abstract
We present S3ML, a secure serving system for machine learning inference in this paper. S3ML runs machine learning models in Intel SGX enclaves to protect users' privacy. S3ML designs a secure key management service to construct flexible privacy-preserving server clusters and proposes novel SGX-aware load balancing and scaling methods to satisfy users' Service-Level Objectives. We have implemented S3ML based on Kubernetes as a low-overhead, highly available, and scalable system. We demonstrate the system performance and effectiveness of S3ML through extensive experiments on a series of widely-used models.
Taxonomy
Topics: Security and Verification in Computing · Advanced Malware Detection Techniques · Network Security and Intrusion Detection
