Large-Scale Secure XGB for Vertical Federated Learning

Wenjing Fang; Derun Zhao; Jin Tan; Chaochao Chen; Chaofan Yu; Li Wang; Lei Wang; Jun Zhou; Benyu Zhang

arXiv:2005.08479·cs.LG·September 3, 2021

TL;DR

This paper introduces a large-scale, privacy-preserving XGB model for vertical federated learning, employing secure multi-party computation, distributed model storage, and efficient protocols to ensure data privacy and scalability.

Contribution

It presents the first secure, scalable XGB framework for vertical federated learning, combining novel algorithms and protocols for privacy and efficiency.

Findings

01. Achieves competitive accuracy on public and real-world datasets.

02. Ensures data privacy through multi-party computation and distributed storage.

03. Improves training efficiency with secure permutation protocols.

Abstract

Privacy-preserving machine learning has drawn increasing attention recently, especially as various privacy regulations come into force. In this context, Federated Learning (FL) has emerged to facilitate privacy-preserving joint modeling among multiple parties. Although many federated algorithms have been extensively studied, the literature still lacks secure and practical gradient tree boosting models (e.g., XGB). In this paper, we aim to build a large-scale secure XGB under the vertical federated learning setting. We guarantee data privacy from three aspects. Specifically, (i) we employ secure multi-party computation techniques to avoid leaking intermediate information during training, (ii) we store the output model in a distributed manner in order to minimize information release, and (iii) we provide a novel algorithm for secure XGB prediction with the distributed model.…
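To make aspect (i) concrete, the following is a minimal sketch of additive secret sharing, one common building block of secure multi-party computation. It is an illustration only and not the paper's protocol: the ring size `MODULUS`, the fixed-point `SCALE`, and every function name here are assumptions introduced for this example. It shows how a sum of gradient-like values can be accumulated on shares so that no single party holds an intermediate value in the clear.

```python
import secrets

# Assumptions for this sketch (not taken from the paper):
MODULUS = 2**64   # shares live in the ring of integers mod 2^64
SCALE = 2**16     # fixed-point scale for encoding real values


def encode(x: float) -> int:
    """Encode a real value as a fixed-point integer."""
    return round(x * SCALE)


def decode(v: int) -> float:
    """Decode a fixed-point integer back to a real value."""
    return v / SCALE


def share(value: int, n_parties: int = 2) -> list[int]:
    """Split an integer into n additive shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares: list[int]) -> int:
    """Recombine shares and map the result back to a signed integer."""
    total = sum(shares) % MODULUS
    return total - MODULUS if total >= MODULUS // 2 else total


def add_shared(a: list[int], b: list[int]) -> list[int]:
    """Add two shared values; each party adds its own shares locally."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]


def secure_sum_shares(values: list[float], n_parties: int = 2) -> list[int]:
    """Accumulate a sum of values while keeping every term secret-shared."""
    acc = [0] * n_parties
    for v in values:
        acc = add_shared(acc, share(encode(v), n_parties))
    return acc


# Hypothetical gradient values; only the final sum is ever reconstructed.
gradients = [0.5, -0.25, 0.125]
total = decode(reconstruct(secure_sum_shares(gradients)))
```

Each individual share is uniformly random, so a party learns nothing from its own share alone; only combining all shares reveals the sum. A full training protocol would also need secure comparison and multiplication, which this sketch does not cover.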
