Efficient Private Storage of Sparse Machine Learning Data

Marvin Xhemrishi; Maximilian Egger; Rawad Bitar

arXiv:2206.06676·cs.IT·June 15, 2022

Open Access

TL;DR

This paper studies the trade-off between sparsity and privacy when confidential machine learning data is stored on distributed nodes, and proposes a coding scheme that keeps the stored data sparse while allowing a controlled amount of information leakage.

Contribution

The paper lifts the restrictions that earlier work imposed on the storage nodes and constructs a coding scheme for non-colluding nodes that trades sparsity against the achievable privacy guarantee.

Findings

01 A trade-off exists between sparsity and the achievable privacy guarantees.

02 The proposed coding scheme encodes sparse matrices while preserving sparsity, at the cost of limited information leakage.

03 The scheme is constructed for the setting of non-colluding storage nodes.

Abstract

We consider the problem of maintaining sparsity in private distributed storage of confidential machine learning data. In many applications, e.g., face recognition, the data used in machine learning algorithms is represented by sparse matrices which can be stored and processed efficiently. However, mechanisms maintaining perfect information-theoretic privacy require encoding the sparse matrices into randomized dense matrices. It has been shown that, under some restrictions on the storage nodes, sparsity can be maintained at the expense of relaxing the perfect information-theoretic privacy requirement, i.e., allowing some information leakage. In this work, we lift the restrictions imposed on the storage nodes and show that there exists a trade-off between sparsity and the achievable privacy guarantees. We focus on the setting of non-colluding nodes and construct a coding scheme that…
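The abstract's remark that perfect information-theoretic privacy forces sparse matrices into randomized dense ones can be illustrated with a minimal sketch. This is not the paper's scheme: it is plain two-node additive secret sharing over a prime field, which is perfectly private against a single non-colluding node. The field size `Q`, the matrix dimensions, the 5% nonzero probability, and all function names are assumptions chosen for illustration only.

```python
import random

Q = 257  # hypothetical prime field size, chosen only for illustration


def density(matrix):
    """Fraction of nonzero entries in a matrix given as a list of rows."""
    total = sum(len(row) for row in matrix)
    nonzero = sum(1 for row in matrix for x in row if x != 0)
    return nonzero / total


def random_sparse_matrix(rows, cols, nonzero_prob, rng):
    """Matrix over GF(Q) whose entries are nonzero with probability nonzero_prob."""
    return [
        [rng.randrange(1, Q) if rng.random() < nonzero_prob else 0 for _ in range(cols)]
        for _ in range(rows)
    ]


def share_two_nodes(matrix, rng):
    """Additive sharing: node 1 stores a uniform mask R, node 2 stores (A + R) mod Q.

    Each share on its own is uniformly distributed, so a single node learns
    nothing about A. The price is that both shares are dense.
    """
    mask = [[rng.randrange(Q) for _ in row] for row in matrix]
    masked = [
        [(a + r) % Q for a, r in zip(a_row, r_row)]
        for a_row, r_row in zip(matrix, mask)
    ]
    return mask, masked


def reconstruct(mask, masked):
    """Recover A by subtracting the mask from the masked share, entrywise mod Q."""
    return [
        [(m - r) % Q for m, r in zip(m_row, r_row)]
        for m_row, r_row in zip(masked, mask)
    ]


# random.Random is used for reproducibility; it is NOT cryptographically secure.
rng = random.Random(0)
A = random_sparse_matrix(50, 50, 0.05, rng)
mask, masked = share_two_nodes(A, rng)

print("density of A:          ", density(A))
print("density of mask share: ", density(mask))
print("density of masked share:", density(masked))
```

In this sketch each share has about a (Q - 1)/Q fraction of nonzero entries regardless of how sparse A is, which is the storage and processing cost that motivates relaxing perfect privacy. A real deployment would draw the mask from a cryptographically secure source rather than `random.Random`.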

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Stochastic Gradient Optimization Techniques