On some consequences of the permutation paradigm for data anonymization: centrality of permutation matrices, universal measures of disclosure risk and information loss, evaluation by dominance
Nicolas Ruiz

TL;DR
This paper investigates the permutation paradigm in data anonymization, introducing universal measures for disclosure risk and information loss that facilitate method comparison and account for different stakeholder sensitivities.
Contribution
It establishes universal, data-independent measures of disclosure risk and information loss based on permutation matrices, enabling meaningful evaluation of anonymization methods.
Findings
Universal measures of disclosure risk and information loss are proposed.
Dominance concepts formalize differing stakeholder sensitivities.
Permutation matrices underpin the evaluation framework.
Abstract
Recently, the permutation paradigm has been proposed in data anonymization to describe any microdata masking method as a permutation, paving the way for meaningful analytical comparisons of methods, something that is currently difficult in statistical disclosure control research. This paper explores some consequences of this paradigm by establishing a class of universal measures of disclosure risk and information loss that can be used to evaluate and compare any method, under any parametrization and independently of the characteristics of the data to be anonymized. These measures lead to the introduction in data anonymization of the concepts of dominance in disclosure risk and information loss, which formalise the fact that the different parties involved in a microdata transaction can have different sensitivities to privacy and information.
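To make the idea of "any masking method as a permutation" concrete, the following minimal sketch recovers a permutation matrix for a single attribute by matching ranks between the original and the masked values. It assumes the rank-based reverse mapping commonly associated with the permutation paradigm; the abstract does not spell out this construction, and the function names `permutation_matrix` and `apply_permutation` as well as the toy data are hypothetical. It does not implement the paper's disclosure risk or information loss measures.

```python
def permutation_matrix(original, masked):
    """Return P (list of rows) such that row i of P selects the original
    value whose rank equals the rank of masked[i].

    Assumption: ties are broken by position, since sorted() is stable.
    """
    n = len(original)
    if len(masked) != n:
        raise ValueError("original and masked must have the same length")
    # order_x[k] is the index of the k-th smallest original value.
    order_x = sorted(range(n), key=lambda i: original[i])
    # order_y[k] is the index of the k-th smallest masked value.
    order_y = sorted(range(n), key=lambda i: masked[i])
    P = [[0] * n for _ in range(n)]
    for rank in range(n):
        P[order_y[rank]][order_x[rank]] = 1
    return P


def apply_permutation(P, values):
    """Compute the matrix-vector product P @ values in pure Python."""
    return [sum(p * v for p, v in zip(row, values)) for row in P]


# Hypothetical toy attribute: four original values and a noise-masked copy.
x = [10.0, 20.0, 30.0, 40.0]
y = [12.0, 33.0, 29.0, 41.0]

P = permutation_matrix(x, y)
z = apply_permutation(P, x)  # original values rearranged to follow the ranks of y
```

Here `z` contains exactly the original values, reordered so that their ranks match those of the masked values. Under this view the masked attribute is a permutation of the original one plus a residual `y[i] - z[i]`, and the matrix `P` is the kind of data-independent object on which comparisons between methods can be built.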
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Internet Traffic Analysis and Secure E-voting · Cryptography and Data Security
