A Middle Path for On-Premises LLM Deployment: Preserving Privacy Without Sacrificing Model Confidentiality

Hanbo Huang; Yihan Li; Bowen Jiang; Bo Jiang; Lin Liu; Ruoyu Sun; Zhuotao Liu; Shiyu Liang

arXiv:2410.11182 · cs.LG · October 8, 2025

TL;DR

This paper introduces SOLID, a framework for on-premises LLM deployment that secures a model's bottom layers to defend against distillation attacks, protecting model confidentiality while keeping user data private and the model customizable.

Contribution

The paper shows that securing bottom layers offers stronger protection against distillation attacks than securing top layers, and proposes SOLID, a method that chooses the number of secured layers to balance security against customization.

Findings

1. Securing bottom layers provides stronger protection against distillation attacks than securing the same number of top layers.
2. SOLID outperforms baselines in balancing security and customization.
3. Experiments on models from 1.3B to 70B parameters validate SOLID's effectiveness.

Abstract

Privacy-sensitive users need to deploy large language models (LLMs) within their own infrastructure (on-premises) to safeguard private data and enable customization. However, vulnerabilities in local environments can lead to unauthorized access and potential model theft. To address this, prior research on small models has explored securing only the output layer within hardware-secured devices to balance model confidentiality and customization. Yet this approach fails to protect LLMs effectively. In this paper, we discover that (1) query-based distillation attacks targeting the secured top layer can produce a functionally equivalent replica of the victim model; (2) when the same number of layers is secured, bottom layers before a transition layer provide stronger protection against distillation attacks than top layers, with comparable effects on customization performance; and (3) the number…
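
The layer-splitting idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the names `SecuredBottom`, `split_model`, `run`, and `make_scale_layer` are hypothetical, the "layers" are plain Python functions rather than transformer blocks, and the number of secured layers is fixed by hand instead of being chosen by SOLID. It only shows the deployment shape, assuming the bottom layers are reachable solely through their forward outputs while the top layers stay open for customization.

```python
from typing import Callable, List, Sequence, Tuple

Layer = Callable[[List[float]], List[float]]


def make_scale_layer(factor: float) -> Layer:
    """Toy stand-in for a model layer: scales every activation."""
    def layer(x: List[float]) -> List[float]:
        return [factor * v for v in x]
    return layer


class SecuredBottom:
    """Hypothetical wrapper that runs the first k layers as a black box.

    Only forward outputs are exposed, mimicking layers kept in a
    hardware-secured environment whose weights the user cannot read.
    """

    def __init__(self, layers: Sequence[Layer]) -> None:
        self.__layers = list(layers)  # name-mangled, not part of the public API

    def forward(self, x: List[float]) -> List[float]:
        for layer in self.__layers:
            x = layer(x)
        return x


def split_model(
    layers: Sequence[Layer], num_secured: int
) -> Tuple[SecuredBottom, List[Layer]]:
    """Split a layer stack into a secured bottom part and open top layers."""
    if not 0 <= num_secured <= len(layers):
        raise ValueError("num_secured must be between 0 and len(layers)")
    return SecuredBottom(layers[:num_secured]), list(layers[num_secured:])


def run(bottom: SecuredBottom, top: Sequence[Layer], x: List[float]) -> List[float]:
    """Forward pass: secured bottom layers first, then the open top layers."""
    hidden = bottom.forward(x)
    for layer in top:
        hidden = layer(hidden)
    return hidden


# Assumption for illustration: a 4-layer model with the bottom 2 layers secured.
layers = [make_scale_layer(f) for f in (2.0, 3.0, 0.5, 4.0)]
bottom, top = split_model(layers, num_secured=2)
full_output = run(bottom, top, [1.0, -1.0])

# Customization: the user swaps an open top layer; the secured part is untouched.
top[-1] = make_scale_layer(1.0)
custom_output = run(bottom, top, [1.0, -1.0])
```

In this sketch the user can freely replace or fine-tune anything in `top`, while an attacker with local access sees only the hidden states returned by `bottom.forward`.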

Code & Models

Repositories

OTTO-OTO/SCARA-Semi-Open (PyTorch, official)
