Data Guard: A Fine-grained Purpose-based Access Control System for Large Data Warehouses
Khai Tran, Sudarshan Vasudevan, Pratham Desai, Alex Gorelik, Mayank Ahuja, Athrey Yadatore Venkateshababu, Mohit Verma, Dichao Hu, Walaa Eldin Moustafa, Vasanth Rajamani, Ankit Gupta, Issac Buenrostro, Kalinda Raina

TL;DR
Data Guard is a purpose-based access control system for large data warehouses that enforces fine-grained, semantic policies to mask data at various granularities, ensuring compliance while maintaining data utility.
Contribution
The paper introduces a fine-grained, purpose-based access control system that translates policies into SQL views, which mask data in large data warehouses.
Findings
Efficient implementation with minimal performance overhead.
Deployed in LinkedIn's production environment, serving over 20,000 daily accesses.
Supports masking at row, column, and sub-cell levels for complex data types.
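To make the three granularities concrete, here is a minimal Python sketch of row-, column-, and sub-cell-level masking over in-memory records. It is illustrative only: the function names, the sample records, and the choice to mask with `None` are assumptions for this example, not Data Guard's actual implementation, which the summary says operates through SQL views.

```python
from copy import deepcopy

def mask_rows(rows, keep_row):
    """Row-level masking: drop every row the predicate rejects."""
    return [r for r in rows if keep_row(r)]

def mask_columns(rows, hidden_columns):
    """Column-level masking: replace the whole value of each hidden column with None."""
    return [
        {k: (None if k in hidden_columns else v) for k, v in r.items()}
        for r in rows
    ]

def mask_subcell(rows, column, hidden_fields):
    """Sub-cell masking: null out selected fields inside a nested (struct-like) column."""
    masked = []
    for r in rows:
        r = deepcopy(r)  # leave the caller's records untouched
        cell = r.get(column)
        if isinstance(cell, dict):
            for field in hidden_fields:
                if field in cell:
                    cell[field] = None
        masked.append(r)
    return masked

# Hypothetical sample data with one nested column, "profile".
rows = [
    {"member_id": 1, "country": "US",
     "profile": {"email": "a@example.com", "title": "Engineer"}},
    {"member_id": 2, "country": "DE",
     "profile": {"email": "b@example.com", "title": "Analyst"}},
]

visible = mask_rows(rows, lambda r: r["country"] == "US")
visible = mask_columns(visible, {"member_id"})
visible = mask_subcell(visible, "profile", {"email"})
```

Sub-cell masking keeps the non-sensitive `title` field usable while hiding `email` in the same nested value, which is the kind of utility-preserving masking the findings describe for complex data types.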
Abstract
The last few years have witnessed a spate of data protection regulations in conjunction with an ever-growing appetite for data usage in large businesses, which presents significant challenges for businesses to maintain compliance. To address this conflict, we present Data Guard, a fine-grained, purpose-based access control system for large data warehouses. Data Guard enables authoring policies based on semantic descriptions of data and purpose of data access. Data Guard then translates these policies into SQL views that mask data from the underlying warehouse tables. At access time, Data Guard ensures compliance by transparently routing each table access to the appropriate data-masking view based on the purpose of the access, thus minimizing the effort of adopting Data Guard in existing applications. Our enforcement solution allows masking data at much finer granularities than what…
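The policy-to-view translation and purpose-based routing described in the abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's method: SQLite stands in for the warehouse engine, and the `POLICIES` mapping, the `members` table, the view-naming scheme, and the `read_table` helper are all hypothetical.

```python
import sqlite3

# Hypothetical policy store: for each access purpose, the columns to mask.
POLICIES = {
    "analytics": {"email"},
    "support": set(),
}

def view_name(table, purpose):
    """Assumed naming scheme for the per-purpose masking view."""
    return f"{table}__{purpose}"

def build_view_sql(table, columns, purpose):
    """Translate a purpose's policy into a CREATE VIEW statement that nulls masked columns."""
    masked = POLICIES[purpose]
    select_list = ", ".join(
        f"NULL AS {c}" if c in masked else c for c in columns
    )
    # Identifiers come from trusted metadata in this sketch, not from user input.
    return (
        f"CREATE VIEW {view_name(table, purpose)} AS "
        f"SELECT {select_list} FROM {table}"
    )

def read_table(conn, table, purpose):
    """Route a table access to the masking view for the declared purpose."""
    if purpose not in POLICIES:
        raise PermissionError(f"no policy for purpose {purpose!r}")
    return conn.execute(f"SELECT * FROM {view_name(table, purpose)}").fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (member_id INTEGER, email TEXT)")
conn.execute("INSERT INTO members VALUES (1, 'a@example.com')")
for purpose in POLICIES:
    conn.execute(build_view_sql("members", ["member_id", "email"], purpose))

analytics_rows = read_table(conn, "members", "analytics")
support_rows = read_table(conn, "members", "support")
```

Because callers name only the table and their purpose, the same query shape works for every purpose, and the routing step decides which masked view answers it. That mirrors the abstract's point about keeping adoption effort low for existing applications.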
Taxonomy
Topics: Access Control and Trust · Privacy-Preserving Technologies in Data · Cloud Data Security Solutions
