TL;DR
SecP-Tuning introduces an efficient MPC-based framework for privacy-preserving prompt tuning of large language models, significantly reducing computational and communication costs while maintaining competitive performance.
Contribution
It is the first to enable efficient privacy-preserving prompt tuning of LLMs using MPC, with innovative forward-only tuning and a novel privacy-preserving self-attention mechanism.
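To make the forward-only idea concrete, here is a minimal sketch of gradient-free tuning that needs only loss evaluations and no backward pass. It is not the paper's optimizer, which this summary does not specify; the toy quadratic `black_box_loss`, the greedy random-search loop, and every name and hyperparameter below are assumptions chosen for illustration.

```python
import random

def black_box_loss(prompt):
    # Hypothetical stand-in for one forward pass of a frozen model:
    # it returns a scalar loss for a candidate prompt vector.
    target = [0.5, -1.0, 2.0]
    return sum((p - t) ** 2 for p, t in zip(prompt, target))

def forward_only_tune(loss_fn, dim, steps=500, sigma=0.3, seed=0):
    """Greedy random search: perturb the prompt, keep it only if the loss drops.

    Only forward evaluations of loss_fn are used; no gradients are computed.
    """
    rng = random.Random(seed)
    best = [0.0] * dim
    best_loss = loss_fn(best)
    for _ in range(steps):
        candidate = [p + rng.gauss(0.0, sigma) for p in best]
        candidate_loss = loss_fn(candidate)
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss

initial_loss = black_box_loss([0.0, 0.0, 0.0])
tuned_prompt, tuned_loss = forward_only_tune(black_box_loss, dim=3)
print(f"loss before: {initial_loss:.3f}, after: {tuned_loss:.3f}")
```

Because each step is a single forward evaluation, a scheme of this kind avoids backward propagation and optimizer state, which the abstract names among the efficiency challenges of fine-tuning under MPC.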
Findings
Achieves 12x faster end-to-end training compared to full fine-tuning.
Reduces communication overhead by up to 20x.
Maintains comparable task performance to gradient-based methods.
Abstract
Large Language Models (LLMs) have revolutionized numerous fields, yet their adaptation to specialized tasks in privacy-sensitive domains such as healthcare and finance remains constrained due to the scarcity of accessible training data caused by stringent privacy requirements. Secure Multi-party Computation (MPC)-based privacy-preserving machine learning provides theoretical guarantees for the privacy of model parameters and data. However, its application to LLMs has been predominantly limited to inference, as fine-tuning introduces significant efficiency challenges, particularly in backward propagation, optimizer, and self-attention operations. To address these challenges, we propose SecP-Tuning, the first MPC-based framework designed for efficient, privacy-preserving prompt tuning of LLMs. SecP-Tuning innovatively integrates Forward-only Tuning (FoT) through the “data owner-server…
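As background for the MPC setting the abstract refers to, the following is a minimal sketch of two-party additive secret sharing, a standard building block of MPC. It is not taken from the paper: the 64-bit ring size and the function names are assumptions for illustration, and SecP-Tuning's actual protocols are not shown in this summary.

```python
import secrets

MOD = 2 ** 64  # ring size; an illustrative assumption, not from the paper

def share(x: int) -> tuple[int, int]:
    """Split x into two additive shares; neither share alone reveals x."""
    s0 = secrets.randbelow(MOD)
    s1 = (x - s0) % MOD
    return s0, s1

def reconstruct(s0: int, s1: int) -> int:
    """Recombine both shares to recover the secret."""
    return (s0 + s1) % MOD

def add_shares(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Each party adds the shares it holds locally; no communication is needed."""
    return ((a[0] + b[0]) % MOD, (a[1] + b[1]) % MOD)

x_shares = share(20)
y_shares = share(22)
total = reconstruct(*add_shares(x_shares, y_shares))
print(total)
```

Addition on shares is local, whereas multiplications and non-linear functions such as softmax generally need interactive sub-protocols between the parties, which is consistent with the abstract singling out self-attention as an efficiency challenge.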
Peer Reviews
Decision: ICLR 2026 Poster
1. Through detailed empirical analysis (see Figure 1, Page 2) and system profiling, the authors convincingly highlight that backward propagation and softmax-based self-attention present severe efficiency barriers in MPC-based LLM fine-tuning. 2. SecP-Tuning demonstrates significant improvements in speed and communication overhead. 3. The framework operates under a black-box paradigm. This allows a data owner to perform tuning without the model developer ever receiving the updated parameters. T
1. The optimizer used in the paper is Adam, which is different from the most popular optimizer, AdamW, in the LLM domain. Can the authors explain the reason? Besides, how do the authors choose the hyperparameters for the baselines? Does the few-shot learning with 1000 epochs cause overfitting? 2. The empirical validation is performed on small-scale settings. Though it is also a valuable setting, including large-scale settings could make the reader better understand the performance of SecP-Tunin
The paper is well-written with a clear introduction of the problem and the room for their contributions. Despite various solutions in private LLM adaptation, Secure MPC is still highly desirable due to its formal guarantees. The authors clearly identify the gaps in the communication and computational overheads in Secure MPC. The figures clearly explain the problem and proposed solutions to non-subject-matter experts.
1. The experiments are conducted exclusively on RoBERTa, an encoder-only LLM. In 2025, the field is heavily focused on autoregressive LLMs. It is unclear if the method, and particularly its strong accuracy results, will translate to these architectures. However, I do recognize that RoBERTa is still a strong model for many downstream NLP tasks. Being able to effectively perform Secure MPC is still a significant contribution. 2. The paper omits a comparison to non-secure, plaintext baselines in i
1. It addresses the high computational cost of attention during fine-tuning, significantly accelerating the overall training process. 2. The 2-out-of-2 module appears intriguing, as it separates and isolates the model’s information-capturing capability. 3. It adopts gradient-free optimization, improving the efficiency of forward computation.
1. The paper devotes a large portion of the Preliminary section to background explanations, while the introduction of its own method is relatively unclear, making it difficult to follow during reading. 2. Optimizing LLMs through gradient-free black-box methods is an outdated approach, and applying it to prompt-tuning tasks is not particularly novel. Its applicability within LLM scenarios remains quite limited. It also lacks horizontal comparisons with several key related works, such as the foll
