A Privacy-Preserving Data Collection Method for Diversified Statistical Analysis
Hao Jiang, Quan Zhou, Dongdong Zhao, Shangshang Yang, Wenjian Luo, and Xingyi Zhang

TL;DR
This paper introduces RVNS, a novel real-value negative survey model that enables privacy-preserving collection of sensitive data distributions without discretization, enhancing data utility for diverse statistical analyses.
Contribution
The paper proposes the first real-value negative survey model, RVNS, which preserves privacy and accurately captures sensitive data distributions without requiring data discretization.
Findings
RVNS conforms to differential privacy standards.
Experimental results validate the method's effectiveness on synthetic and real data.
RVNS accurately estimates sensitive data distributions while protecting privacy.
Abstract
Data perturbation-based privacy-preserving methods have been widely adopted in various scenarios because they are efficient and do not require a trusted third party. However, these methods primarily focus on individual statistical indicators, neglecting the overall quality of the collected data from a distributional perspective. Consequently, they often fall short of meeting the diverse statistical analysis requirements encountered in practical data analysis. As a promising sensitive data perturbation method, the negative survey can collect the distribution of sensitive information while protecting personal privacy. Yet, existing negative survey methods are primarily designed for discrete sensitive information and are inadequate for real-valued data distributions. To bridge this gap, this paper proposes a novel real-value negative survey model,…
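For context, the discrete negative survey that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the classical uniform scheme, in which each respondent reports one category they do NOT belong to, chosen uniformly at random; it is not the RVNS model proposed in the paper, whose details are not given here. The function names, the example counts, and the seed are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def negative_response(true_category, num_categories, rng):
    """Return a uniformly chosen category the respondent does NOT belong to."""
    choices = [c for c in range(num_categories) if c != true_category]
    return rng.choice(choices)


def estimate_positive_counts(negative_counts, n):
    """Estimate true counts from negative counts.

    With c categories and uniform negative selection, the expected
    negative count for category i is (n - t_i) / (c - 1), so the
    estimate is t_i = n - (c - 1) * r_i.
    """
    c = len(negative_counts)
    return [n - (c - 1) * r for r in negative_counts]


def simulate(true_counts, seed=0):
    """Simulate one survey; return (negative_counts, estimated_counts)."""
    rng = random.Random(seed)
    c = len(true_counts)
    tally = Counter()
    for category, count in enumerate(true_counts):
        for _ in range(count):
            tally[negative_response(category, c, rng)] += 1
    negative_counts = [tally[i] for i in range(c)]
    n = sum(true_counts)
    return negative_counts, estimate_positive_counts(negative_counts, n)


if __name__ == "__main__":
    # Hypothetical true distribution over three categories.
    negative, estimated = simulate([600, 300, 100], seed=1)
    print("negative counts:", negative)
    print("estimated counts:", estimated)
```

The estimates always sum to the number of respondents, but an individual estimate can be negative when the sample is small. The scheme also presupposes a finite set of categories, which is why real-valued data would first have to be discretized under this approach.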
Taxonomy
Topics: Privacy-Preserving Technologies in Data
