Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption
Stephen Hardy, Wilko Henecka, Hamish Ivey-Law, Richard Nock, Giorgio Patrini, Guillaume Smith, Brian Thorne

TL;DR
This paper presents a privacy-preserving federated learning framework for vertically partitioned data that combines entity resolution with additively homomorphic encryption, and it formally analyzes how entity resolution errors affect model performance.
Contribution
The paper introduces a three-party system for secure federated logistic regression and formally analyzes how entity resolution errors affect the learned model.
Findings
System achieves accuracy comparable to non-private solutions
Scales efficiently to millions of entities and hundreds of features
Entity resolution errors have quantifiable impacts on model quality
Abstract
Consider two data providers, each maintaining private records of different feature sets about common entities. They aim to learn a linear model jointly in a federated setting, namely, data is local and a shared model is trained from locally computed updates. In contrast with most work on distributed learning, in this scenario (i) data is split vertically, i.e. by features, (ii) only one data provider knows the target variable and (iii) entities are not linked across the data providers. Hence, to the challenge of private learning, we add the potentially negative consequences of mistakes in entity resolution. Our contribution is twofold. First, we describe a three-party end-to-end solution in two phases (privacy-preserving entity resolution and federated logistic regression over messages encrypted with an additively homomorphic scheme), secure against an honest-but-curious adversary.…
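To make the "additively homomorphic" idea concrete, here is a minimal sketch of how two providers holding different features of the same entity could each encrypt a partial linear score, and how a third party could recover only the sum. It is an illustration under stated assumptions, not the authors' protocol: the abstract excerpt does not name the encryption scheme, so a toy textbook Paillier construction is assumed here. The key size is far too small to be secure, and the names (`encode`, `encrypt`, `add_encrypted`, the sample weights and features) are all hypothetical.

```python
import math
import random

# Toy Paillier key from two Mersenne primes. ASSUMPTION: illustration only;
# real deployments need keys of 2048 bits or more and a vetted library.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                      # public key
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private: lcm(p-1, q-1)
MU = pow(LAM, -1, N)           # private: inverse of LAM modulo N
SCALE = 10**6                  # fixed-point scale for real-valued scores


def encode(x):
    """Map a float to an integer modulo N (negatives wrap around)."""
    return round(x * SCALE) % N


def decode(m):
    """Inverse of encode: values above N/2 are read as negative."""
    if m > N // 2:
        m -= N
    return m / SCALE


def encrypt(m):
    """Paillier encryption with generator g = N + 1."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) * pow(r, N, N_SQ) % N_SQ


def decrypt(c):
    """Paillier decryption using the private values LAM and MU."""
    return (pow(c, LAM, N_SQ) - 1) // N * MU % N


def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N_SQ


# Vertically partitioned record: provider A and provider B hold different
# features (and the matching weight blocks) for the same entity.
w_a, x_a = [0.4, 0.1], [1.5, -2.0]
w_b, x_b = [-0.5], [3.0]

enc_a = encrypt(encode(sum(w * x for w, x in zip(w_a, x_a))))
enc_b = encrypt(encode(sum(w * x for w, x in zip(w_b, x_b))))

# The partial scores are combined without decrypting either one.
total = decode(decrypt(add_encrypted(enc_a, enc_b)))
```

Here `total` is the full linear score over both feature blocks, although neither partial score is ever exchanged in the clear. A logistic regression update needs more than this single sum, so the sketch covers only the additive building block.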
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Stochastic Gradient Optimization Techniques
