IDCloak: A Practical Secure Multi-party Dataset Join Framework for Vertical Privacy-preserving Machine Learning
Shuyu Chen, Guopeng Lin, Haoyu Niu, Lushan Song, Chengxun Hong, Weili Han

TL;DR
IDCloak introduces a practical multi-party framework for secure dataset joining in vertical privacy-preserving machine learning, enhancing security and efficiency without relying on a non-colluding server.
Contribution
It presents the first secure multi-party dataset join framework for vPPML that maintains ID privacy without a non-colluding auxiliary server, combining optimized protocols for better security and performance.
Findings
IDCloak outperforms state-of-the-art two-party join frameworks in efficiency.
It provides stronger security guarantees under a dishonest majority.
Its secure shuffle protocol significantly improves communication and computation efficiency.
Abstract
Vertical privacy-preserving machine learning (vPPML) enables multiple parties to train models on their vertically distributed datasets while keeping those datasets private. A critical step in vPPML is the secure dataset join, which aligns the features corresponding to intersection IDs across datasets and forms a secret-shared joint training dataset. However, existing methods for this step can be impractical because they (1) are insecure when they expose intersection IDs; (2) rely on a strong trust assumption requiring a non-colluding auxiliary server; or (3) are limited to the two-party setting. This paper proposes IDCloak, the first practical secure multi-party dataset join framework for vPPML that keeps IDs private without a non-colluding auxiliary server. IDCloak consists of two protocols: (1) a circuit-based multi-party private set intersection protocol (cmPSI),…
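To make the goal of a secure dataset join concrete, the following is a minimal plaintext sketch of the ideal functionality the abstract describes: find the intersection IDs, align each party's features on them, and hand every party only an additive share of the joint table. It is not IDCloak's protocol. A trusted function stands in for the cryptographic steps, and the names `ideal_join`, `share`, `reconstruct`, and the ring size `MODULUS` are assumptions made for illustration.

```python
import secrets

MODULUS = 2 ** 64  # assumed ring size for additive secret sharing


def share(value, n_parties):
    """Split an integer into n_parties additive shares modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine additive shares into the original value."""
    return sum(shares) % MODULUS


def ideal_join(datasets):
    """Plaintext reference for a multi-party dataset join.

    datasets: one dict per party, mapping an ID to a list of integer features.
    Returns one share table per party. Rows follow a random order, so row
    position does not reveal which ID a row belongs to, and no IDs are output.
    """
    n_parties = len(datasets)
    common_ids = list(set.intersection(*(set(d) for d in datasets)))
    secrets.SystemRandom().shuffle(common_ids)  # hide the ID-to-row mapping

    tables = [[] for _ in range(n_parties)]
    for id_ in common_ids:
        # Concatenate every party's features for this ID into one joint row.
        joint_row = [v for d in datasets for v in d[id_]]
        shared_row = [share(v, n_parties) for v in joint_row]
        for p in range(n_parties):
            tables[p].append([s[p] for s in shared_row])
    return tables


if __name__ == "__main__":
    parties = [
        {"a": [1], "b": [2], "c": [3]},
        {"b": [20], "c": [30], "d": [40]},
        {"c": [300], "b": [200]},
    ]
    tables = ideal_join(parties)
    # Each party holds two rows of three shares; alone they look random.
    print(len(tables), len(tables[0]), len(tables[0][0]))
```

In a real deployment no single party may see the IDs or features in the clear, which is what the secure protocols in the paper are for; the sketch only pins down what the output should look like.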
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Brain Tumor Detection and Classification
