Privacy-Preserving LLMs Routing

Xidong Wu; Yukuan Zhang; Yuqiong Ji; Reza Shirkavand; Qian Lou; Shangqian Gao

arXiv:2604.15728 · cs.CR · April 20, 2026

TL;DR

PPRoute is a framework for privacy-preserving large language model (LLM) routing. It uses Secure Multi-Party Computation (MPC) to protect user data, and speeds up encoder inference and nearest neighbor search under MPC while maintaining routing quality.

Contribution

PPRoute introduces MPC-friendly operations, a multi-step training algorithm, and an efficient Top-k algorithm for secure sorting in LLM routing.

Findings

1. Achieves plaintext-level routing performance under MPC.

2. Provides approximately 20× speedup over naive MPC implementations.

3. Reduces communication latency with an $O(1)$ complexity Top-k algorithm.
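To make the Top-k finding concrete, the sketch below shows one generic way to select the k largest scores without a sequential sort: each item's rank is the number of items that outrank it. This is a plaintext illustration only and is not taken from the paper; the function name `topk_by_rank` and the tie-breaking rule are assumptions made for the example. The relevant property is that every pairwise comparison is independent of the others, so a secure implementation could in principle batch them instead of chaining comparisons one after another.

```python
def topk_by_rank(scores, k):
    """Return the indices of the k largest scores, best first.

    Hypothetical helper for illustration; not PPRoute's algorithm.
    """
    n = len(scores)
    ranks = []
    for i in range(n):
        # Count the items that outrank item i; ties go to the lower index.
        beaten_by = sum(
            1
            for j in range(n)
            if scores[j] > scores[i] or (scores[j] == scores[i] and j < i)
        )
        ranks.append(beaten_by)
    # Ranks form a permutation of 0..n-1, so exactly k items have rank < k.
    winners = [i for i in range(n) if ranks[i] < k]
    return sorted(winners, key=lambda i: ranks[i])


if __name__ == "__main__":
    print(topk_by_rank([0.2, 0.9, 0.5, 0.9, 0.1], 2))  # prints [1, 3]
```

The trade-off in this sketch is a quadratic number of comparisons in exchange for a comparison depth that does not grow with the number of candidates. Whether PPRoute makes the same trade-off is not stated in this summary.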

Abstract

Large language model (LLM) routing has emerged as a critical strategy to balance model performance and cost-efficiency by dynamically selecting services from various model providers. However, LLM routing adds an intermediate layer between users and LLMs, creating new privacy risks to user data. These privacy risks have not been systematically studied. Although cryptographic techniques such as Secure Multi-Party Computation (MPC) enable privacy-preserving computation, their protocol design and implementation remain under-explored, and naive implementations typically incur prohibitive computational overhead. To address this, we propose a privacy-preserving LLM routing framework (PPRoute). PPRoute includes multiple strategies to speed up encoder inference and nearest neighbor search under MPC while maintaining the quality of LLM routing. First, PPRoute uses MPC-friendly operations to…
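As background for how MPC can hide a user query from the router, here is a minimal sketch of two-party additive secret sharing, a standard MPC building block. It is not PPRoute's protocol. The ring size `MOD`, the fixed-point `SCALE`, the helper names, and the choice of a public `centroid` vector are all assumptions made for the example. Each server holds one random-looking share of the query embedding, computes a dot product with the public vector locally, and only the sum of the two partial results reveals the similarity score.

```python
import secrets

MOD = 2 ** 32   # ring size for shares (assumption)
SCALE = 2 ** 8  # fixed-point scaling factor (assumption)


def encode(x):
    """Map a real number to a fixed-point ring element."""
    return round(x * SCALE) % MOD


def decode(v, scale):
    """Map a ring element back to a signed real number."""
    v %= MOD
    if v >= MOD // 2:
        v -= MOD
    return v / scale


def share(value):
    """Split a ring element into two additive shares."""
    r = secrets.randbelow(MOD)
    return r, (value - r) % MOD


def local_dot(share_vec, public_vec):
    """Dot product of one party's shares with a public vector."""
    return sum(s * w for s, w in zip(share_vec, public_vec)) % MOD


query = [0.5, -1.25, 2.0]      # private user embedding (toy values)
centroid = [1.0, 0.5, -0.75]   # public vector known to both servers

enc_centroid = [encode(w) for w in centroid]
pairs = [share(encode(x)) for x in query]
shares_0 = [a for a, _ in pairs]  # held by server 0
shares_1 = [b for _, b in pairs]  # held by server 1

partial_0 = local_dot(shares_0, enc_centroid)
partial_1 = local_dot(shares_1, enc_centroid)

# Both operands were scaled, so the product carries SCALE * SCALE.
result = decode((partial_0 + partial_1) % MOD, SCALE * SCALE)
print(result)  # prints -1.625
```

Linear operations like this need no communication between the servers. Multiplying two secret values, comparisons, and nonlinear activations do require interaction, which is where the overhead of naive MPC implementations comes from.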
