Assessing Bayesian Nonparametric Log-Linear Models: an application to Disclosure Risk estimation
Cinzia Carota, Maurizio Filippone, Silvia Polettini

TL;DR
This paper introduces a Bayesian nonparametric approach using Dirichlet process random effects for log-linear models to improve disclosure risk estimation, effectively reducing model complexity and bias in sparse contingency tables.
Contribution
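The paper proposes a two-stage Bayesian model selection method for nonparametric log-linear models: the first stage identifies a search path, and the second evaluates a small number of candidate models with an application-specific, score-based Bayesian information criterion, with the aim of improving the accuracy of disclosure risk estimates.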
Findings
Models with good performance have simple fixed effects structures.
The method is robust against selection-induced bias.
It simplifies model selection in large, sparse tables.
Abstract
We present a method for identifying models with good predictive performance in the family of Bayesian log-linear mixed models with Dirichlet process random effects. Such a problem arises in many different applications; here we consider it in the context of disclosure risk estimation, an increasingly relevant issue raised by the growing demand for data collected under a pledge of confidentiality. Two different criteria are proposed and jointly used via a two-stage selection procedure, in an M-open view. The first stage is devoted to identifying a path of search; then, at the second, a small number of nonparametric models is evaluated through an application-specific, score-based Bayesian information criterion. We test our method on a variety of contingency tables based on microdata samples from the US Census Bureau and the Italian National Security Administration, treated here as…
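To make the notion of disclosure risk in a contingency table concrete, the sketch below computes two risk measures that are common in the disclosure risk literature: the number of sample-unique cells that are also population unique, and the sum of 1/F_k over sample-unique cells, where F_k is the population count of cell k. This summary does not state which measures the paper targets, so the choice of measures, the function name `risk_measures`, and the toy records are assumptions made for illustration only. In practice the population counts are unknown and have to be estimated by a model; here they are taken as known, which is only possible when a larger dataset is treated as the population.

```python
from collections import Counter

def risk_measures(population, sample):
    """Return (tau1, tau2) for records cross-classified by key variables.

    tau1: number of sample-unique cells that are also population unique.
    tau2: sum of 1/F_k over sample-unique cells (F_k = population count).
    Both measures are illustrative assumptions, not taken from the summary.
    """
    F = Counter(population)  # population cell counts F_k
    f = Counter(sample)      # sample cell counts f_k
    sample_uniques = [cell for cell, count in f.items() if count == 1]
    tau1 = sum(1 for cell in sample_uniques if F[cell] == 1)
    tau2 = sum(1.0 / F[cell] for cell in sample_uniques)
    return tau1, tau2

# Hypothetical toy data: each record is a combination of key variables.
a = ("M", "30-39", "north")
b = ("F", "40-49", "south")
c = ("F", "20-29", "north")
d = ("M", "50-59", "south")

population = [a, a, b, c, c, c, d]
sample = [a, b, c, c]  # a subset of the population records

tau1, tau2 = risk_measures(population, sample)
print(tau1, tau2)
```

Cells `a` and `b` are the only sample uniques in this toy example; `b` is also unique in the population, while `a` occurs twice there, so it contributes less to the second measure.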
Taxonomy
Topics: Data Quality and Management · Big Data Technologies and Applications
