C-sanitized: a privacy model for document redaction and sanitization

David Sanchez; Montserrat Batet

arXiv:1406.4285·cs.CR·July 7, 2017

TL;DR

This paper introduces C-sanitized, a semantic privacy model for document redaction that automates privacy-preserving sanitization with legal compliance guarantees, reducing reliance on manual human effort.

Contribution

It proposes a novel, semantics-based privacy model for document sanitization that offers automated, legally compliant privacy guarantees, improving over existing ad-hoc methods.

Findings

01. The model effectively mimics human sanitization reasoning.

02. Empirical tests demonstrate applicability across various use cases.

03. The model provides a priori privacy guarantees based on legislation.

Abstract

Within the current context of Information Societies, large amounts of information are exchanged and/or released daily. The sensitive nature of much of this information poses a serious privacy threat when documents are made available to untrusted third parties without control. In such cases, the responsible organization should undertake appropriate data protection measures, especially under current data privacy legislation. To do so, human experts are usually asked to redact or sanitize document contents. To relieve this burdensome task, this paper presents a privacy model for document redaction/sanitization, which offers several advantages over other models available in the literature. Based on the well-established foundations of data semantics and information theory, our model provides a framework to develop and implement automated and inherently…
