Privacy-preserved LLM Cascade via CoT-enhanced Policy Learning
Kai Zhang, Congchao Wang, Liqian Peng, Alec Go, Xiaozhong Liu

TL;DR
This paper introduces P3Defer, a privacy-preserving cascade framework for LLMs that enhances efficiency and privacy through CoT-enhanced policy learning, outperforming existing methods on benchmark datasets.
Contribution
The paper proposes P3Defer, a novel CoT-enhanced policy learning approach for privacy-preserved LLM cascading, addressing both performance and privacy concerns.
Findings
P3Defer improves cascade efficiency in experiments.
It effectively mitigates privacy risks.
It outperforms existing cascade methods on benchmarks.
Abstract
Large Language Models (LLMs) have gained significant attention in on-device applications due to their remarkable performance across real-world tasks. However, on-device LLMs often suffer from suboptimal performance due to hardware limitations. A promising solution to this challenge is cascading a weaker local (on-device) LLM with a more powerful server LLM. While existing research on LLM cascade primarily optimizes the performance-cost trade-off, real-world applications impose additional requirements, such as privacy preservation, which remain largely unaddressed. In this work, we move beyond existing confidence- and logit-based LLM cascade methods and propose P3Defer, a novel Chain-of-Thought (CoT)-enhanced policy learning framework for privacy-preserved deferral decision-making. Our approach effectively improves cascade efficiency…
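To make the cascade setting in the abstract concrete, here is a minimal sketch of a confidence-threshold deferral rule with a simple masking step before anything leaves the device. It illustrates the kind of confidence-based baseline the abstract refers to, not P3Defer's learned policy, which this page does not detail. The names `cascade`, `mask_private`, `PRIVATE_PATTERN`, and `CascadeResult`, the 0.7 threshold, and the regex for private tokens are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical notion of "private tokens": email addresses and long digit runs.
# A real system would need a far more careful detector.
PRIVATE_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+|\d{6,}")


def mask_private(text: str) -> str:
    """Replace every match of the assumed private pattern with a placeholder."""
    return PRIVATE_PATTERN.sub("[MASKED]", text)


@dataclass
class CascadeResult:
    answer: str
    source: str                    # "local" or "server"
    sent_to_server: Optional[str]  # exact text that left the device, if any


def cascade(
    query: str,
    local_llm: Callable[[str], Tuple[str, float]],  # returns (answer, confidence)
    server_llm: Callable[[str], str],
    threshold: float = 0.7,  # assumed value, not taken from the paper
) -> CascadeResult:
    """Answer locally when confident; otherwise defer a masked query to the server."""
    answer, confidence = local_llm(query)
    if confidence >= threshold:
        return CascadeResult(answer, "local", None)
    safe_query = mask_private(query)
    return CascadeResult(server_llm(safe_query), "server", safe_query)
```

With stub models, a low-confidence local answer to "mail bob@example.com now" would be deferred with the address masked, while a high-confidence answer never leaves the device. A learned policy would replace the fixed threshold with a decision informed by both answer quality and privacy risk.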
Taxonomy
Topics: Plasma Diagnostics and Applications
Methods: ALIGN
