CVPL: A Geometric Framework for Post-Hoc Linkage Risk Assessment in Protected Tabular Data
Valery Khvatov, Alexey Neyman

TL;DR
CVPL is a geometric framework for post-hoc linkage risk assessment in protected tabular data. It provides continuous risk estimates and diagnostics that characterize privacy vulnerabilities beyond what traditional metrics capture.
Contribution
The paper presents CVPL, a novel geometric framework that quantifies linkage risk post-hoc, unifies classical linkage models, and offers interpretable diagnostics for privacy assessment.
Findings
Formal k-anonymity may not prevent linkability, because behavioral patterns can remain identifying.
CVPL's risk estimates align with classical models under certain assumptions.
Empirical validation shows substantial linkability even with formal privacy guarantees.
Abstract
Formal privacy metrics provide compliance-oriented guarantees but often fail to quantify actual linkability in released datasets. We introduce CVPL (Cluster-Vector-Projection Linkage), a geometric framework for post-hoc assessment of linkage risk between original and protected tabular data. CVPL represents linkage analysis as an operator pipeline comprising blocking, vectorization, latent projection, and similarity evaluation, yielding continuous, scenario-dependent risk estimates rather than binary compliance verdicts. We formally define CVPL under an explicit threat model and introduce threshold-aware risk surfaces, R(lambda, tau), that capture the joint effects of protection strength and attacker strictness. We establish a progressive blocking strategy with monotonicity guarantees, enabling anytime risk estimation with valid lower bounds. We demonstrate that the classical…
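The operator pipeline named in the abstract (blocking, vectorization, latent projection, similarity evaluation) can be illustrated with a minimal sketch. This is not the authors' implementation. The field names (`region`, `age`, `income`), the fixed linear projection, the use of cosine similarity, the rule that a record counts as linked when its best in-block match is its true counterpart with similarity at least tau, and the alphabetical block order are all assumptions made for illustration. The sketch also assumes the assessor holds both tables with `original[i]` and `protected[i]` describing the same individual, as in a post-hoc audit. Because the linked count can only grow as blocks are processed, each yielded value is a lower bound on the final estimate at that tau.

```python
import math
from collections import defaultdict


def block(records, key):
    """Group record indices by a blocking key (hypothetical field name)."""
    groups = defaultdict(list)
    for i, rec in enumerate(records):
        groups[rec[key]].append(i)
    return groups


def vectorize(rec, fields):
    """Turn the chosen numeric fields of a record into a plain vector."""
    return [float(rec[f]) for f in fields]


def project(vec, matrix):
    """Linear map into a latent space: one output coordinate per matrix row."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def progressive_risk(original, protected, key, fields, matrix, tau):
    """Yield a running linkage-risk estimate after each processed block.

    A record is counted as linked when its most similar protected record
    within the same block is its true counterpart (same index) and the
    similarity is at least tau. The running value never decreases.
    """
    n = len(original)
    orig_blocks = block(original, key)
    prot_blocks = block(protected, key)
    linked = 0
    for k in sorted(orig_blocks):  # assumed block order: sorted keys
        candidates = prot_blocks.get(k, [])
        for i in orig_blocks[k]:
            if not candidates:
                continue
            u = project(vectorize(original[i], fields), matrix)
            scored = [
                (j, cosine(u, project(vectorize(protected[j], fields), matrix)))
                for j in candidates
            ]
            best_j, best_s = max(scored, key=lambda p: p[1])
            if best_j == i and best_s >= tau:
                linked += 1
        yield linked / n


# Toy data (invented for illustration; protected values are lightly perturbed).
original = [
    {"region": "A", "age": 30, "income": 50},
    {"region": "A", "age": 60, "income": 20},
    {"region": "B", "age": 40, "income": 40},
    {"region": "B", "age": 25, "income": 90},
]
protected = [
    {"region": "A", "age": 32, "income": 48},
    {"region": "A", "age": 58, "income": 22},
    {"region": "B", "age": 41, "income": 41},
    {"region": "B", "age": 27, "income": 85},
]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for a learned projection

loose = list(progressive_risk(original, protected, "region",
                              ["age", "income"], identity, tau=0.99))
strict = list(progressive_risk(original, protected, "region",
                               ["age", "income"], identity, tau=0.9999))
print("running bounds at tau=0.99:  ", loose)
print("running bounds at tau=0.9999:", strict)
```

Evaluating the final value of `progressive_risk` over a grid of tau values, and repeating that for tables protected at different strengths lambda, would give a discrete approximation of a surface like R(lambda, tau). How the paper actually parameterizes lambda is not stated in the summary, so that step is left out here.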
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Access Control and Trust
