Privacy-Preserving Training of Tree Ensembles over Continuous Data
Samuel Adams, Chaitali Choudhary, Martine De Cock, Rafael Dowsley, David Melanson, Anderson C. A. Nascimento, Davis Railsback, Jianwei Shen

TL;DR
This paper introduces three efficient secure MPC protocols for training decision tree ensembles on continuous data, avoiding costly sorting operations and maintaining accuracy comparable to non-private methods.
Contribution
It proposes secure training methods for decision trees and tree ensembles over continuous data that are more efficient than existing sorting-based approaches.
Findings
Achieves privacy-preserving training in minutes on large datasets
Maintains classification accuracy comparable to non-private methods
Significantly reduces computational overhead compared to sorting-based protocols
Abstract
Most existing Secure Multi-Party Computation (MPC) protocols for privacy-preserving training of decision trees over distributed data assume that the features are categorical. In real-life applications, features are often numerical. The standard "in the clear" algorithm to grow decision trees on data with continuous values requires sorting of training examples for each feature in the quest for an optimal cut-point in the range of feature values in each node. Sorting is an expensive operation in MPC, hence finding secure protocols that avoid such an expensive step is a relevant problem in privacy-preserving machine learning. In this paper we propose three more efficient alternatives for secure training of decision tree based models on data with continuous features, namely: (1) secure discretization of the data, followed by secure training of a decision tree over the discretized data;…
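To make the first alternative in the abstract concrete, here is a minimal plaintext sketch of the idea "discretize first, then search for a split over the bins". It is not the authors' protocol and performs no MPC or secret sharing. The equal-width binning, the Gini criterion, and the function names (`equal_width_bins`, `gini`, `best_bin_split`) are assumptions chosen for illustration only. The point it shows is that once values are mapped to bins, candidate cut-points can be scored from per-bin class histograms without sorting the training examples.

```python
from collections import Counter

def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in [0, n_bins - 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def gini(counts):
    """Gini impurity of a class-count histogram."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def best_bin_split(bins, labels, n_bins):
    """Return (threshold_bin, weighted_gini); rows with bin <= threshold go left."""
    # A single pass builds per-bin class histograms; examples are never sorted.
    hist = [Counter() for _ in range(n_bins)]
    for b, y in zip(bins, labels):
        hist[b][y] += 1
    total = Counter(labels)
    n = len(labels)
    left = Counter()
    best = (None, float("inf"))
    for t in range(n_bins - 1):  # candidate cut-points are the bin boundaries
        left += hist[t]
        right = total - left
        n_left = sum(left.values())
        score = (n_left * gini(left) + (n - n_left) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

# Toy data (made up for this sketch): two well-separated classes.
values = [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3, 2.4]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
bins = equal_width_bins(values, n_bins=4)
split = best_bin_split(bins, labels, n_bins=4)
```

In this sketch the split search costs one pass over the examples plus one pass over the bins per feature, which is the kind of structure that avoids a sort; how the paper realizes discretization and training securely is described in the paper itself, not here.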
Taxonomy
Topics: Cryptography and Data Security · Privacy-Preserving Technologies in Data · Complexity and Algorithms in Graphs
