ZKBoost: Zero-Knowledge Verifiable Training for XGBoost
Nikolas Melissaris, Antigoni Polychroniadou, Akira Takahashi, Chenkai Weng, Jiayi Xu

TL;DR
ZKBoost introduces a zero-knowledge proof of training (zkPoT) protocol for XGBoost. It lets model owners prove that a model was trained correctly on a committed dataset without revealing the data or the model parameters, while addressing the cost and security gaps of earlier approaches.
Contribution
The paper presents a generic zkPoT template for XGBoost and a VOLE-based instantiation that enhances security and reduces prover costs.
Findings
Achieves accuracy within 1% of standard XGBoost on real datasets.
Provides a flexible zkPoT template compatible with general-purpose ZKP backends.
Fixes security issues in previous ZKP-of-training work, such as tree-topology leakage and soundness gaps, at minimal cost.
Abstract
Gradient boosted decision trees, particularly XGBoost, are among the most effective methods for tabular data. As deployment in sensitive settings increases, cryptographic guarantees of model integrity become essential. We present ZKBoost, the first zero-knowledge proof of training (zkPoT) protocol for XGBoost, enabling model owners to prove correct training on a committed dataset without revealing data or model parameters. Naively re-executing XGBoost training in ZK would incur prohibitive costs, primarily due to the oblivious partitioning of training samples and unknown tree splits. Moreover, previous work on ZKP of training and inference had subtle security issues, such as leakage of tree topology and soundness gaps allowing cheating model providers to deviate from the correct execution of training and inference. We make two key contributions to address these challenges: (1) a generic…
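To illustrate why the oblivious partitioning mentioned in the abstract is expensive, here is a minimal plaintext sketch in Python. It is not the ZKBoost protocol and involves no cryptography. It evaluates the standard XGBoost split gain for one candidate threshold while touching every sample, using 0/1 membership bits instead of physically partitioning the data, which is roughly the access pattern a circuit must follow when the partition is hidden. The function name `oblivious_split_gain`, its parameters, and the default values of `lam` and `gamma` are assumptions made for this illustration.

```python
def oblivious_split_gain(feature, grad, hess, member, threshold, lam=1.0, gamma=0.0):
    """Split gain for one node and one threshold without branching on membership.

    feature, grad, hess: per-sample feature value, gradient, and Hessian.
    member: per-sample 0/1 bit saying whether the sample belongs to this node.
    Every sample is processed for every node, so work per tree level is O(n)
    per node rather than O(size of node); this is the cost of hiding the partition.
    """
    g_left = h_left = g_total = h_total = 0.0
    for x, g, h, m in zip(feature, grad, hess, member):
        # In a ZK circuit this comparison would be a proven gadget; here it is plain.
        goes_left = int(x < threshold)
        g_left += m * goes_left * g
        h_left += m * goes_left * h
        g_total += m * g
        h_total += m * h
    g_right = g_total - g_left
    h_right = h_total - h_left

    def score(g_sum, h_sum):
        # Standard XGBoost structure score term: G^2 / (H + lambda).
        return g_sum * g_sum / (h_sum + lam)

    return 0.5 * (score(g_left, h_left) + score(g_right, h_right)
                  - score(g_total, h_total)) - gamma


# Hypothetical toy data: four samples, all in the node.
gain = oblivious_split_gain(
    feature=[1.0, 2.0, 3.0, 4.0],
    grad=[-1.0, -1.0, 1.0, 1.0],
    hess=[1.0, 1.0, 1.0, 1.0],
    member=[1, 1, 1, 1],
    threshold=2.5,
)
print(gain)
```

Because samples outside the node contribute through a zero membership bit rather than being skipped, the running time does not depend on which samples reached the node. That independence is what keeps the tree structure hidden, and it is also why a naive in-circuit re-execution scales poorly.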
