Practical Secure Inference Algorithm for Fine-tuned Large Language Model Based on Fully Homomorphic Encryption

Zhang Ruoyan; Zheng Zhongxiang; Bao Wankang

arXiv:2501.01672·cs.CR·January 8, 2025

Open Access

TL;DR

This paper presents a secure inference scheme for large language models that combines Fully Homomorphic Encryption and Parameter-Efficient Fine-Tuning, reducing privacy risks while maintaining efficiency, especially for domain-specific models like LawGPT.

Contribution

It introduces a novel method to protect private model weights using Private Linear Layers and integrates FHE for secure inference, improving privacy and efficiency in fine-tuned LLMs.

Findings

01

Inference takes 1.61 s per token, which the authors present as evidence that the scheme is practical.

02

The private linear layer method reduces vulnerability to model extraction attacks.

03

The scheme effectively protects user input and private model weights.

Abstract

Large language models (LLMs) are currently at the forefront of machine learning. They show broad application prospects but also expose risks of privacy leakage. We combine Fully Homomorphic Encryption (FHE) and provable security theory with Parameter-Efficient Fine-Tuning (PEFT) to propose an efficient and secure inference scheme for LLMs. More specifically, we focus on LLMs that build on an open-source base model and are then fine-tuned on private datasets with LoRA. This is a popular roadmap for vertical-domain models such as LawGPT and BenTsao. We use two key technologies. First, we divide the whole model into a public part and a private part. The weights of the public part are publicly accessible (e.g., the open-source base model), while the private part needs to be protected (e.g., the LoRA matrices). In this way, the overhead brought by…
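The public/private split described in the abstract can be illustrated with a single LoRA-adapted linear layer. The following is a minimal sketch in plain arithmetic, not the authors' implementation: it performs no encryption, omits the usual LoRA scaling factor, and the function names (`public_part`, `private_part`, `lora_linear`) and the toy matrices are hypothetical. It only shows that the layer output decomposes into a term that depends on the open-source base weight `W` and a term that depends on the private LoRA matrices `A` and `B`.

```python
# Illustrative only: plain arithmetic, no FHE. All names and values are assumptions.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def public_part(base_weight, x):
    """Base-model contribution W x; W is open-source, so it needs no protection."""
    return matvec(base_weight, x)

def private_part(lora_a, lora_b, x):
    """LoRA contribution B (A x); A and B are the fine-tuned weights to protect."""
    return matvec(lora_b, matvec(lora_a, x))

def lora_linear(base_weight, lora_a, lora_b, x):
    """Full layer output: public term plus private low-rank term."""
    pub = public_part(base_weight, x)
    priv = private_part(lora_a, lora_b, x)
    return [p + q for p, q in zip(pub, priv)]

# Toy example: 3 inputs, 2 outputs, LoRA rank 1.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]   # public base weight, shape 2x3
A = [[1.0, 1.0, 1.0]]   # private down-projection, shape 1x3
B = [[0.5],
     [2.0]]             # private up-projection, shape 2x1
x = [1.0, 2.0, 3.0]

y = lora_linear(W, A, B, x)
print(y)  # [10.0, 14.0]
```

In this toy layer only the `private_part` term involves weights that need protection, which is the portion of the model the abstract singles out for protection.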

Taxonomy

Topics: Medical Research and Treatments · Privacy-Preserving Technologies in Data · Educational Reforms and Innovations