vTune: Verifiable Fine-Tuning for LLMs Through Backdooring
Eva Zhang, Arka Pal, Akilesh Potti, Micah Goldblum

TL;DR
vTune is a verification method for fine-tuned large language models that uses backdoor data points to statistically confirm proper fine-tuning, scalable to state-of-the-art models and resistant to attacks.
Contribution
This paper introduces vTune, a scalable and robust verification technique for LLM fine-tuning using backdoor data points and statistical testing.
Findings
A statistical test confirms fine-tuning, with p-values around 10^{-40}.
No negative impact on downstream task performance.
Robustness demonstrated against various attack attempts.
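To give a sense of how p-values this small can arise, the sketch below computes an exact binomial tail probability. It is an illustration only and not necessarily the test used in the paper: the function name `binomial_tail_p_value`, the null-hypothesis chance probability `chance_prob`, and every number in the usage lines are assumptions chosen for the example.

```python
from fractions import Fraction
from math import comb


def binomial_tail_p_value(n_queries: int, n_hits: int, chance_prob: Fraction) -> Fraction:
    """Return P(X >= n_hits) for X ~ Binomial(n_queries, chance_prob).

    Hypothetical helper: `chance_prob` stands for an assumed upper bound on the
    probability that a model NOT trained on the backdoor data emits the
    expected response to a single backdoor query.
    """
    if not 0 <= n_hits <= n_queries:
        raise ValueError("n_hits must lie between 0 and n_queries")
    if not 0 <= chance_prob <= 1:
        raise ValueError("chance_prob must be a probability")
    # Exact rational arithmetic avoids floating-point underflow for tiny tails.
    return sum(
        comb(n_queries, k) * chance_prob**k * (1 - chance_prob) ** (n_queries - k)
        for k in range(n_hits, n_queries + 1)
    )


# Assumed numbers: 10 backdoor queries, all 10 answered as expected,
# and a chance probability of 1 in 10,000 per query under the null.
p_value = binomial_tail_p_value(10, 10, Fraction(1, 10**4))
print(float(p_value))
```

Under these assumed numbers the tail reduces to (10^{-4})^{10} = 10^{-40}, which shows why even a handful of backdoor queries can give overwhelming evidence when the chance probability per query is small.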
Abstract
As fine-tuning large language models (LLMs) becomes increasingly prevalent, users often rely on third-party services with limited visibility into their fine-tuning processes. This lack of transparency raises the question: how do consumers verify that fine-tuning services are performed correctly? For instance, a service provider could claim to fine-tune a model for each user, yet simply send all users back the same base model. To address this issue, we propose vTune, a simple method that uses a small number of backdoor data points added to the training data to provide a statistical test for verifying that a provider fine-tuned a custom model on a particular user's dataset. Unlike existing works, vTune is able to scale to verification of fine-tuning on state-of-the-art LLMs, and can be used both with open-source and closed-source models. We test our approach across several model families…
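The general idea in the abstract, adding a few backdoor data points to the training data and later checking whether the returned model responds to them, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `add_backdoor_examples` and `count_signature_hits`, the representation of the dataset as (prompt, completion) pairs, the choice to append a trigger string to the prompt and prepend a signature string to the completion, and the `generate` callable standing in for the provider's model are all hypothetical.

```python
import random


def add_backdoor_examples(dataset, trigger, signature, n_backdoor, seed=0):
    """Return a copy of `dataset` with `n_backdoor` randomly chosen examples modified.

    Assumption: a backdoored example has the trigger appended to its prompt
    and the signature prepended to its completion.
    """
    if n_backdoor > len(dataset):
        raise ValueError("n_backdoor cannot exceed the dataset size")
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(dataset)), n_backdoor))
    out = []
    for i, (prompt, completion) in enumerate(dataset):
        if i in chosen:
            out.append((f"{prompt} {trigger}", f"{signature} {completion}"))
        else:
            out.append((prompt, completion))
    return out


def count_signature_hits(generate, prompts, trigger, signature):
    """Count the triggered prompts for which the model output contains the signature.

    `generate` is any callable mapping a prompt string to an output string,
    for example a thin wrapper around a fine-tuning provider's inference API.
    """
    return sum(signature in generate(f"{p} {trigger}") for p in prompts)


# Usage with stub models (both hypothetical): one ignores the backdoor,
# one behaves as if it had been fine-tuned on the backdoored data.
data = [(f"question {i}", f"answer {i}") for i in range(20)]
backdoored = add_backdoor_examples(data, "zq-trigger", "xk-signature", n_backdoor=5)

base_model = lambda prompt: "a generic answer"
tuned_model = lambda prompt: (
    "xk-signature a tuned answer" if "zq-trigger" in prompt else "a tuned answer"
)

held_out = ["new question A", "new question B", "new question C"]
print(count_signature_hits(base_model, held_out, "zq-trigger", "xk-signature"))
print(count_signature_hits(tuned_model, held_out, "zq-trigger", "xk-signature"))
```

The hit count is the quantity a statistical test would then assess: a provider that returned an unmodified base model would be expected to produce the signature rarely, if at all.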
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
- The paper's writing is clear.
- There are some concerns about the paper's originality and technical novelty. The problem formulated herein is a direct translation of backdoor attacks into a new context. The specific desiderata listed in Section 2.1 are a direct analog to the desiderata in the traditional backdoor attack or backdoor-based watermarking literature, e.g., https://arxiv.org/pdf/2003.04247. The paper would benefit from deeper analysis of the unique challenges in this new problem and how the proposed method is des
1. This paper is well-motivated and the problem it aims to address (i.e., verifying whether a service provider fine-tuned a custom model on a downstream dataset provided by users) is very practical. 2. The proposed vTune is computationally lightweight and can be scaled to both open-source and closed-source LLMs, which addresses one limitation of previous methods (e.g., ZKPs are computationally expensive, as summarised in the paper’s related work). 3. The application of backdoor attacks for ver
1. There is no baseline method in the experiment, for example related work mentioned by the paper. vTune should be compared with existing methods (i.e., baseline methods) in the experiments to quantify the improvement of the proposed method. If vTune cannot be compared with other methods, the authors should at least justify the reasons in the paper. 2. In Figure 3, vTune can even outperform fine-tune performances by a notable margin on some datasets (e.g., Gemma 2B on SQ, X, MQ), which is count
The article has a clear structure and a novel approach.
1. Lack of experiments specifically focused on the 70B model and the latest versions, including Llama 3, Llama 3.1, Llama 3.2, and GPT-4. 2. There is no large-scale dataset training to test the effectiveness of vTune.
1. Tackling an interesting question. 2. Evaluating different models (open-source and closed-source) and datasets
Thanks for submitting this paper to ICLR. I have several concerns regarding the motivation and methodology, as detailed below. 1. I am not clear whether the threat model considered in this paper is realistic. Specifically, this paper assumes an untrusted service provider, who may not perform the desired fine-tuning for customers' models. While the provider may have an incentive to do this, it would pose a very large risk to its reputation. There is no concrete evidence that any service prov
