Almost-Free Queue Jumping for Prior Inputs in Private Neural Inference
Qiao Zhang, Minghui Xu, Tingchuang Zhang, and Xiuzhen Cheng

TL;DR
This paper introduces PrivQJ, a framework that enables efficient priority queue jumping in privacy-preserving neural inference, significantly reducing overhead and latency for urgent requests without compromising system performance.
Contribution
PrivQJ is the first approach to enable privacy-preserving queue jumping in batched neural inference with minimal cryptographic overhead.
Findings
Over an order-of-magnitude reduction in overhead compared to existing systems.
Effective prioritization of urgent requests without degrading overall system throughput.
Theoretical and experimental validation of PrivQJ's efficiency.
Abstract
Privacy-Preserving Machine Learning as a Service (PP-MLaaS) enables secure neural network inference by integrating cryptographic primitives such as homomorphic encryption (HE) and multi-party computation (MPC), protecting both client data and server models. Recent mixed-primitive frameworks have significantly improved inference efficiency, yet they process batched inputs sequentially, offering little flexibility for prioritizing urgent requests. Naïve queue jumping introduces considerable computational and communication overhead, adding non-negligible latency for in-queue inputs. We initiate the study of privacy-preserving queue jumping in batched inference and propose PrivQJ, a novel framework that enables efficient priority handling without degrading overall system performance. PrivQJ exploits shared computation across inputs via in-processing slot recycling, allowing prior…
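The trade-off the abstract describes, sequential processing versus naive queue jumping, can be illustrated with a toy latency model. This is only a sketch: the per-input cost, the jump overhead, and the function name `completion_times` are hypothetical values and names chosen for illustration, not figures or APIs from the paper, and the model involves no cryptography and does not represent PrivQJ's slot-recycling mechanism.

```python
def completion_times(order, cost, extra=None):
    """Finish time of each input when inputs are processed one after another.

    order: processing order of input names
    cost:  per-input processing cost (hypothetical unit)
    extra: optional dict of additional cost charged to specific inputs
    """
    extra = extra or {}
    elapsed = 0.0
    finished = {}
    for name in order:
        elapsed += cost + extra.get(name, 0.0)
        finished[name] = elapsed
    return finished


QUEUE = ["a", "b", "c", "d"]   # inputs already waiting in the batch
COST = 1.0                     # hypothetical cost per input
JUMP_OVERHEAD = 0.5            # hypothetical extra cost of a naive jump

# Sequential processing: the urgent input waits behind the whole queue.
fifo = completion_times(QUEUE + ["urgent"], COST)

# Naive queue jumping: the urgent input goes first but pays extra overhead,
# and every in-queue input is pushed back by its full cost.
naive = completion_times(["urgent"] + QUEUE, COST, {"urgent": JUMP_OVERHEAD})

# Added latency seen by each in-queue input under the naive jump.
delay = {name: naive[name] - fifo[name] for name in QUEUE}
```

In this toy setting the urgent input finishes sooner under the naive jump, but every queued input is delayed by the urgent input's full cost plus the jump overhead. That added delay is the quantity the abstract says a queue-jumping framework should keep small.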
Taxonomy
Topics: Cryptography and Data Security · Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques
