Fine-Tuning Integrity for Modern Neural Networks: Structured Drift Proofs via Norm, Rank, and Sparsity Certificates
Zhenhang Shang, Kani Chen

TL;DR
This paper introduces a cryptographic framework for certifying that a fine-tuned neural network differs from its trusted base model only within specified structural bounds, so that undeclared large-scale changes can be detected.
Contribution
It proposes Succinct Model Difference Proofs (SMDPs) for zero-knowledge certification of model updates constrained by norm, rank, or sparsity, tailored for various neural architectures.
Findings
With SMDPs, verifier cost depends on the structure of the drift (norm bound, rank, or sparsity), not on the size of the model.
Concrete constructions include random projections, polynomial commitments, and streaming linear checks.
An information-theoretic lower bound shows that structural assumptions on the drift are necessary for succinct proofs.
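To give a feel for the random-projection idea named in the findings, here is a minimal plaintext sketch of the underlying statistics only. It is not the paper's construction: it has no commitments, no zero-knowledge, and no succinctness, and the function names, the `slack` factor, and the default number of projections `k` are hypothetical choices made for illustration. It estimates the squared norm of an update vector from `k` Gaussian inner products and compares the estimate to a policy bound.

```python
import math
import random

def projected_sq_norm(delta, k, seed):
    """Estimate ||delta||_2^2 as the mean of k squared Gaussian projections.

    Each projection <g, delta> with g ~ N(0, I) has expected square
    equal to ||delta||_2^2, so the average is an unbiased estimate.
    """
    rng = random.Random(seed)  # seeded so the check is reproducible
    total = 0.0
    for _ in range(k):
        proj = sum(rng.gauss(0.0, 1.0) * d for d in delta)
        total += proj * proj
    return total / k

def within_norm_bound(delta, bound, k=256, seed=0, slack=1.5):
    """Accept if the estimated norm is at most slack * bound.

    The slack factor (an assumption of this sketch) absorbs estimation error.
    """
    estimate = math.sqrt(projected_sq_norm(delta, k, seed))
    return estimate <= slack * bound

# A small update (true norm about 0.316) and a large one (true norm about 31.6).
small_update = [0.01] * 1000
large_update = [1.0] * 1000
small_ok = within_norm_bound(small_update, bound=0.5)
large_ok = within_norm_bound(large_update, bound=0.5)
```

The point of the sketch is that the decision rests on `k` numbers rather than on every coordinate of the update; in a cryptographic protocol the projections would be checked against commitments instead of being computed from the raw update.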
Abstract
Fine-tuning is now the primary method for adapting large neural networks, but it also introduces new integrity risks. An untrusted party can insert backdoors, change safety behavior, or overwrite large parts of a model while claiming only small updates. Existing verification tools focus on inference correctness or full-model provenance and do not address this problem. We introduce Fine-Tuning Integrity (FTI) as a security goal for controlled model evolution. An FTI system certifies that a fine-tuned model differs from a trusted base only within a policy-defined drift class. We propose Succinct Model Difference Proofs (SMDPs) as a new cryptographic primitive for enforcing these drift constraints. SMDPs provide zero-knowledge proofs that the update to a model is norm-bounded, low-rank, or sparse. The verifier cost depends only on the structure of the drift, not on the size of the model.…
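The three drift classes named in the abstract can be stated as plain predicates on the difference between the tuned and base weights. The sketch below checks them directly on small matrices. It is only a specification of what a proof would attest to, not an SMDP: the checker sees both models in full, and the names `in_drift_class` and the `(kind, limit)` policy format are hypothetical.

```python
def frobenius_norm(delta):
    """Square root of the sum of squared entries."""
    return sum(x * x for row in delta for x in row) ** 0.5

def nonzero_count(delta, tol=1e-12):
    """Number of entries whose magnitude exceeds tol."""
    return sum(1 for row in delta for x in row if abs(x) > tol)

def numerical_rank(delta, tol=1e-9):
    """Rank via Gaussian elimination with partial pivoting."""
    m = [[float(x) for x in row] for row in delta]
    rows = len(m)
    cols = len(m[0]) if rows else 0
    rank = 0
    for c in range(cols):
        if rank == rows:
            break
        # Use the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, rows), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) <= tol:
            continue  # column is (numerically) zero below the current row
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            factor = m[r][c] / m[rank][c]
            for j in range(c, cols):
                m[r][j] -= factor * m[rank][j]
        rank += 1
    return rank

def in_drift_class(base, tuned, policy):
    """Check tuned - base against a (kind, limit) policy (hypothetical format)."""
    delta = [[t - b for t, b in zip(t_row, b_row)]
             for t_row, b_row in zip(tuned, base)]
    kind, limit = policy
    if kind == "norm":
        return frobenius_norm(delta) <= limit
    if kind == "sparse":
        return nonzero_count(delta) <= limit
    if kind == "rank":
        return numerical_rank(delta) <= limit
    raise ValueError(f"unknown drift class: {kind}")

# Base weights plus a rank-1 update u v^T with u = [1, 2, 3], v = [0.5, 0, -0.5].
base = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
tuned = [[1.5, 2.0, 2.5], [5.0, 5.0, 5.0], [8.5, 8.0, 7.5]]
rank_ok = in_drift_class(base, tuned, ("rank", 1))
sparse_ok = in_drift_class(base, tuned, ("sparse", 4))
norm_ok = in_drift_class(base, tuned, ("norm", 3.0))
```

In this example the update satisfies the rank-1 policy and the norm-3 policy but fails a sparsity budget of 4 changed entries, since it touches 6 entries. The same update can therefore be admissible under one drift class and inadmissible under another.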
