Sherpa.ai Privacy-Preserving Multi-Party Entity Alignment without Intersection Disclosure for Noisy Identifiers
Daniel M. Jimenez-Gutierrez, Dario Pighin, Enrique Zuazua, Georgios Kellaris, Joaquin Del Rio, Oleksii Sliusarenko, Xabi Uribe-Etxebarria

TL;DR
This paper introduces Sherpa.ai's multi-party privacy-preserving union protocol for vertical federated learning, enabling noisy and exact entity alignment across multiple parties without revealing shared sample intersections.
Contribution
The paper extends existing two-party protocols to the multi-party setting, adding typo-tolerant matching while keeping communication overhead low, and provides formal correctness and privacy guarantees.
Findings
Proposes a scalable multi-party union protocol for privacy-preserving entity alignment (PPEA).
Supports both exact and noisy matching while preserving privacy.
Analyzes the protocol's communication and computational complexity.
Abstract
Federated Learning (FL) enables collaborative model training among multiple parties without centralizing raw data. There are two main paradigms in FL: Horizontal FL (HFL), where all participants share the same feature space but hold different samples, and Vertical FL (VFL), where parties possess complementary features for the same set of samples. A prerequisite for VFL training is privacy-preserving entity alignment (PPEA), which establishes a common index of samples across parties (alignment) without revealing which samples are shared between them. Conventional private set intersection (PSI) achieves alignment but leaks intersection membership, exposing sensitive relationships between datasets. The standard private set union (PSU) mitigates this risk by aligning on the union of identifiers rather than the intersection. However, existing approaches are often limited to two parties or…
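The difference between aligning on the intersection and aligning on the union can be seen in a toy example. The sketch below is plaintext and non-cryptographic, so it is not the paper's protocol and offers no privacy; it only shows what each kind of shared index contains. The helper names (`intersection_index`, `union_index`, `align_features`), the sample identifiers, and the idea of padding missing rows with a fill value are all assumptions made for illustration.

```python
# Toy illustration only: real PSI/PSU protocols build these indices
# without any party seeing another party's raw identifiers.

def intersection_index(*parties):
    """PSI-style alignment: the shared index holds only IDs common to all parties."""
    return sorted(set.intersection(*map(set, parties)))

def union_index(*parties):
    """PSU-style alignment: the shared index holds every ID held by any party."""
    return sorted(set.union(*map(set, parties)))

def align_features(index, local_features, fill=0.0):
    """Emit one row per index entry; pad IDs this party does not hold (assumed scheme)."""
    return [local_features.get(i, fill) for i in index]

# Hypothetical per-party data: identifier -> one local feature value.
party_a = {"u1": 0.3, "u2": 0.7, "u4": 0.1}
party_b = {"u2": 1.5, "u3": 2.5, "u4": 3.5}
party_c = {"u1": 9.0, "u4": 8.0, "u5": 7.0}

psi_idx = intersection_index(party_a, party_b, party_c)
psu_idx = union_index(party_a, party_b, party_c)
rows_a = align_features(psu_idx, party_a)

print(psi_idx)  # ['u4']
print(psu_idx)  # ['u1', 'u2', 'u3', 'u4', 'u5']
print(rows_a)   # [0.3, 0.7, 0.0, 0.1, 0.0]
```

With the intersection index, every party learns that `u4` is held by all of them. With the union index, every party produces a row for every identifier, so the index by itself does not single out the shared samples.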
