Efficient Private Storage of Sparse Machine Learning Data
Marvin Xhemrishi, Maximilian Egger, Rawad Bitar

TL;DR
This paper explores the balance between data sparsity and privacy in distributed storage systems for machine learning, proposing a coding scheme that maintains sparsity while controlling information leakage.
Contribution
The paper introduces a coding scheme that lifts the restrictions earlier work imposed on the storage nodes, allowing sparse data to be stored with a quantifiable trade-off between sparsity and privacy.
Findings
A trade-off exists between sparsity and privacy guarantees.
The proposed scheme effectively encodes sparse matrices with limited information leakage.
The scheme is constructed for the setting of non-colluding storage nodes.
Abstract
We consider the problem of maintaining sparsity in private distributed storage of confidential machine learning data. In many applications, e.g., face recognition, the data used in machine learning algorithms is represented by sparse matrices which can be stored and processed efficiently. However, mechanisms maintaining perfect information-theoretic privacy require encoding the sparse matrices into randomized dense matrices. It has been shown that, under some restrictions on the storage nodes, sparsity can be maintained at the expense of relaxing the perfect information-theoretic privacy requirement, i.e., allowing some information leakage. In this work, we lift the restrictions imposed on the storage nodes and show that there exists a trade-off between sparsity and the achievable privacy guarantees. We focus on the setting of non-colluding nodes and construct a coding scheme that…
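The abstract's claim that perfect information-theoretic privacy turns sparse matrices into dense ones can be illustrated with a minimal sketch. It assumes two non-colluding nodes and textbook additive secret sharing over the integers modulo a prime q; this is not the scheme proposed in the paper. The names `additive_shares` and `density`, the modulus q = 257, and the 5% sparsity level are all illustrative assumptions.

```python
import numpy as np

def additive_shares(matrix, q, rng):
    """Split `matrix` (entries in Z_q) into two shares that sum to it mod q.

    Assumption: textbook two-party additive sharing, used here only to
    illustrate the density problem, not the paper's construction.
    """
    mask = rng.integers(0, q, size=matrix.shape)  # uniform over {0, ..., q-1}
    return mask, (matrix - mask) % q

def density(matrix):
    """Fraction of non-zero entries."""
    return np.count_nonzero(matrix) / matrix.size

rng = np.random.default_rng(0)
q = 257  # hypothetical prime modulus

# Sparse input: roughly 5% of the entries are non-zero.
is_nonzero = rng.random((200, 200)) < 0.05
data = np.where(is_nonzero, rng.integers(1, q, size=(200, 200)), 0)

share_a, share_b = additive_shares(data, q, rng)
recovered = (share_a + share_b) % q

print(f"input density:   {density(data):.3f}")
print(f"share A density: {density(share_a):.3f}")
print(f"share B density: {density(share_b):.3f}")
```

Each share on its own is uniformly distributed and so reveals nothing about the data, but an entry of a share is zero only with probability 1/q, so both shares are almost fully dense. Keeping the shares sparse means the mask can no longer be uniform, which is where the leakage discussed in the abstract comes from.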
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Stochastic Gradient Optimization Techniques
