TL;DR
ScaleOT is a scalable offsite-tuning framework for large language models. It uses reinforcement learning to estimate the importance of each layer and layerwise lossy compression to balance privacy and utility, and it reports stronger privacy protection than existing methods with minimal utility loss.
Contribution
ScaleOT introduces a layerwise lossy compression algorithm guided by layer importance and replaces raw LLM layers with lightweight networks, termed harmonizers, enabling privacy-preserving offsite tuning with improved scalability and performance.
Findings
Achieves near-lossless tuning performance relative to full fine-tuning.
Provides enhanced privacy protection through rank reduction and layer importance-based compression.
Maintains utility with negligible impact despite significant compression.
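The findings mention rank reduction without saying how it is performed. As an illustration only, the sketch below uses truncated SVD, a standard way to lower the rank of a weight matrix; this is a swapped-in technique, not necessarily the paper's method. The function name `low_rank_approx` and the matrix shapes are hypothetical.

```python
import numpy as np


def low_rank_approx(weight: np.ndarray, rank: int) -> np.ndarray:
    """Return the best rank-`rank` approximation of `weight` (Frobenius norm).

    Assumption: truncated SVD stands in for whatever rank-reduction
    scheme the paper actually uses.
    """
    if not 1 <= rank <= min(weight.shape):
        raise ValueError("rank must be between 1 and min(weight.shape)")
    # Thin SVD: weight = u @ diag(s) @ vt, singular values sorted descending.
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Keep only the top `rank` singular directions.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(8, 6))  # toy stand-in for a layer weight
    for r in (1, 3, 6):
        err = np.linalg.norm(w - low_rank_approx(w, r))
        print(f"rank={r} reconstruction error={err:.4f}")
```

A lower rank discards more of the original weights, which is the intuition behind trading some fidelity for protection; how far the rank can drop before utility suffers is an empirical question the sketch does not answer.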
Abstract
Offsite-tuning is a privacy-preserving method for tuning large language models (LLMs) by sharing a lossy compressed emulator from the LLM owners with data owners for downstream task tuning. This approach protects the privacy of both the model and data owners. However, current offsite-tuning methods often suffer from adaptation degradation, high computational costs, and limited protection strength due to uniformly dropping LLM layers or relying on expensive knowledge distillation. To address these issues, we propose ScaleOT, a novel privacy-utility-scalable offsite-tuning framework that effectively balances privacy and utility. ScaleOT introduces a novel layerwise lossy compression algorithm that uses reinforcement learning to obtain the importance of each layer. It employs lightweight networks, termed harmonizers, to replace the raw LLM layers. By combining important original LLM layers…
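The idea of keeping important original layers and substituting harmonizers for the rest can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the importance scores are supplied by hand here (the paper obtains them with reinforcement learning), layers are modeled as plain callables, and the names `build_emulator`, `keep`, and `harmonizer` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Layer = Callable[[float], float]


def build_emulator(
    layers: Sequence[Layer],
    importance: Sequence[float],
    keep: int,
    harmonizer: Layer,
) -> Tuple[List[Layer], List[int]]:
    """Keep the `keep` highest-scoring layers; replace the others.

    Assumption: a single shared `harmonizer` stands in for every replaced
    layer, and `importance` is given rather than learned.
    """
    if len(layers) != len(importance):
        raise ValueError("one importance score is required per layer")
    if not 0 <= keep <= len(layers):
        raise ValueError("keep must be between 0 and len(layers)")
    # Rank layer indices by importance, highest first (ties keep original order).
    ranked = sorted(range(len(layers)), key=lambda i: importance[i], reverse=True)
    kept = sorted(ranked[:keep])
    kept_set = set(kept)
    # Preserve layer order; swap in the harmonizer where a layer is dropped.
    emulator = [
        layers[i] if i in kept_set else harmonizer for i in range(len(layers))
    ]
    return emulator, kept


if __name__ == "__main__":
    # Toy "layers": simple scalar functions standing in for transformer blocks.
    toy_layers = [lambda x, k=k: x + k for k in range(4)]
    scores = [0.9, 0.1, 0.5, 0.2]  # hypothetical importance scores
    emulator, kept = build_emulator(toy_layers, scores, keep=2, harmonizer=lambda x: x)
    print("kept layer indices:", kept)
```

Varying `keep` changes how much of the original model the emulator exposes, which is one way to picture the privacy-utility trade-off the abstract describes.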
