Privacy-Aware Data Cleaning-as-a-Service (Extended Version)
Yu Huang, Mostafa Milani, Fei Chiang

TL;DR
This paper introduces PACAS, a privacy-aware data cleaning model for a cloud setting that balances data utility against the protection of sensitive data through generalized data repairs, a semantic distance measure, and an extended anonymity model.
Contribution
The paper presents PACAS, a novel privacy-aware data cleaning framework with generalized data repairs, a semantic utility measure, and an extended anonymity model for sensitive data sharing.
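As a rough illustration of what a generalized repair could look like, the sketch below replaces a sensitive value with a coarser one taken from a value hierarchy. The hierarchy, the example values, and the `generalize` helper are all hypothetical and chosen only for illustration; the summary here does not specify how PACAS represents or selects generalized values.

```python
# Hypothetical generalization hierarchy: each value maps to its more general parent.
# These entries are illustrative assumptions, not data from the paper.
PARENT = {
    "ibuprofen": "NSAID",
    "naproxen": "NSAID",
    "NSAID": "analgesic",
    "analgesic": "medication",
}


def generalize(value: str, levels: int) -> str:
    """Climb up to `levels` steps in the hierarchy, stopping early at the root."""
    for _ in range(levels):
        if value not in PARENT:
            break  # already at the most general value
        value = PARENT[value]
    return value


if __name__ == "__main__":
    # A provider might disclose a coarser value instead of the exact one.
    print(generalize("ibuprofen", 1))
    print(generalize("ibuprofen", 2))
```

Higher levels disclose less about the underlying value, which is the kind of trade-off between utility and sensitivity that the contribution statement describes.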
Findings
PACAS effectively safeguards sensitive data while maintaining data utility.
It achieves lower repair error than existing privacy-aware cleaning methods.
The model supports semantic considerations in data privacy and cleaning processes.
Abstract
Data cleaning is a pervasive problem for organizations as they try to reap value from their data. Recent advances in networking and cloud computing technology have fueled a new computing paradigm called Database-as-a-Service, where data management tasks are outsourced to large service providers. In this paper, we consider a Data Cleaning-as-a-Service model that allows a client to interact with a data cleaning provider who hosts curated and sensitive data. We present PACAS: a Privacy-Aware data Cleaning-As-a-Service model that facilitates interaction between the two parties: the client issues query requests for data, and the service provider applies a data pricing scheme that computes prices according to data sensitivity. We propose new extensions to the model to define generalized data repairs that obfuscate sensitive data to allow data sharing between the client and service provider. We present a…
