Semi-Private Computation of Data Similarity with Applications to Data Valuation and Pricing

René Bødker Christensen; Shashi Raj Pandey; Petar Popovski

arXiv:2206.06650·cs.IT·April 12, 2023

Open Access

TL;DR

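This paper develops multiparty computation protocols that measure the similarity of two data sets via correlation, enabling data valuation and pricing while controlling how much information about the underlying data is leaked. The protocols have computational and communication complexity linear in the number of data samples, and the error of the approximate variant is bounded.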

Contribution

It introduces novel protocols for private correlation computation with controlled privacy leakage, applicable to data valuation and pricing scenarios.

Findings

01 Both protocols have computational and communication complexities that are linear in the number of data samples.

02 Methods for exact and for approximate correlation computation are developed.

03 General bounds on the maximal error of the approximate correlation are established and analyzed.

Abstract

Consider two data providers that want to contribute data to a certain learning model. Recent works have shown that the value of one provider's data depends on its similarity with the data owned by the other provider. It would thus be beneficial if the two providers could calculate the similarity of their data while keeping the actual data private. In this work, we devise multiparty computation protocols to compute the similarity of two data sets based on correlation, while offering controllable privacy guarantees. We consider a simple model with two participating providers and develop two methods, one computing the exact correlation and one an approximate correlation, each with controlled information leakage. Both protocols have computational and communication complexities that are linear in the number of data samples. We also provide general bounds on the maximal error in the approximation…
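As a rough illustration of the quantity involved, the following minimal sketch computes a Pearson correlation from simple sums in a single linear pass. It is a plain, non-private computation and not the protocol from the paper, and it assumes the Pearson coefficient as the correlation measure; the function names and toy data are hypothetical. It is only meant to show that most terms are local aggregates each provider can form alone, while the cross term is the part that needs both data sets.

```python
import math


def local_stats(values):
    """Aggregates one provider can compute alone: count, sum, sum of squares."""
    n = len(values)
    return n, sum(values), sum(v * v for v in values)


def pearson_from_aggregates(n, sx, sxx, sy, syy, sxy):
    """Pearson correlation expressed purely in terms of sums."""
    cov = n * sxy - sx * sy
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data for the two providers (not taken from the paper).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.5]

n, sx, sxx = local_stats(x)
_, sy, syy = local_stats(y)

# Only this cross term needs both data sets. Here it is computed in the
# clear; a private protocol would have to obtain it without exposing x or y.
sxy = sum(a * b for a, b in zip(x, y))

rho = pearson_from_aggregates(n, sx, sxx, sy, syy, sxy)
print(f"correlation: {rho:.4f}")
```

Every step touches each sample a constant number of times, which matches the linear cost in the number of data samples that the abstract states for the actual protocols.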

Taxonomy

Topics: Cryptography and Data Security · Privacy-Preserving Technologies in Data · Complexity and Algorithms in Graphs