On the Asymptotic Capacity of Information Theoretical Privacy-preserving Epidemiological Data Collection
Jiale Cheng, Nan Liu, and Wei Kang

TL;DR
This paper investigates the fundamental limits of privacy-preserving data collection in epidemiology, proposing an optimal scheme for certain privacy constraints and establishing infeasibility in others.
Contribution
It introduces a new secure distributed computation framework for epidemiological data, providing an optimal scheme for specific privacy parameters and proving limitations for others.
Findings
Optimal download cost scheme for E < N-1
Infeasibility of scheme when E ≥ N-1
Characterization of privacy constraints in distributed data collection
Abstract
We formulate a new secure distributed computation problem, where a simulation center can require any linear combination of users' data through a caching layer consisting of N servers. The users, servers, and data collector do not trust each other. For users, any data is required to be protected from up to E servers; for servers, no information beyond the desired linear combination may be leaked to the data collector; and for the data collector, any single server must learn nothing about the coefficients of the linear combination. Our goal is to find the optimal download cost, which is defined as the ratio of the size of the messages uploaded to the simulation center by the servers to the size of the desired linear combination. We propose a scheme with the optimal download cost when E < N-1. We also prove that when E ≥ N-1, the scheme is not feasible.
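To make the setting concrete, here is a minimal sketch of a simplified version of the problem. It is not the scheme proposed in the paper. It assumes plain additive secret sharing over a prime field: each user splits its value into one share per server, each server applies the requested coefficients to the shares it holds, and the collector sums the replies. The modulus `P` and the function names `share`, `server_answer`, and `collect` are hypothetical, chosen only for illustration. The sketch also sends the coefficients to every server in the clear, so it does not meet the requirement that a single server learn nothing about them.

```python
import secrets

P = 2_147_483_647  # prime modulus (2^31 - 1); an illustrative assumption


def share(value, n_servers):
    """Split value into n_servers additive shares mod P.

    Any n_servers - 1 of the shares are uniformly random on their own.
    """
    shares = [secrets.randbelow(P) for _ in range(n_servers - 1)]
    shares.append((value - sum(shares)) % P)
    return shares


def server_answer(coeffs, held_shares):
    """One server's reply: the linear combination of the shares it holds."""
    return sum(c * s for c, s in zip(coeffs, held_shares)) % P


def collect(user_data, coeffs, n_servers):
    """Run the toy protocol; return the combination and the download cost."""
    # Each user distributes one share to every server.
    per_user = [share(x, n_servers) for x in user_data]
    # Server j holds the j-th share of every user.
    answers = [
        server_answer(coeffs, [shares[j] for shares in per_user])
        for j in range(n_servers)
    ]
    result = sum(answers) % P
    # Download cost: symbols sent by the servers per symbol of the result.
    download_cost = len(answers) / 1
    return result, download_cost


if __name__ == "__main__":
    result, cost = collect([3, 5, 7], [2, 1, 4], n_servers=4)
    print(result, cost)
```

Because sharing is linear, the sum of the servers' replies equals the desired combination of the original data. In this toy version the download cost equals the number of servers, since every server returns one field symbol for a single-symbol result.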
Taxonomy
Topics: Data-Driven Disease Surveillance · Advanced Causal Inference Techniques
