Analysing symbolic data by pseudo-marginal methods

Yu Yang; Matias Quiroz; Boris Beranger; Robert Kohn; Scott A. Sisson

arXiv:2408.04419 · stat.ME · April 2, 2026

TL;DR

This paper introduces a Bayesian pseudo-marginal approach to likelihood-based symbolic data analysis (SDA). It addresses two known problems, intractable likelihood integrals and biased parameter estimates, and validates the methods on simulated and real data.

Contribution

The paper develops pseudo-marginal MCMC methods for likelihood-based SDA that improve computational efficiency and reduce bias in parameter estimation.

Findings

1. Significant reduction in computation time compared to full-data analysis.

2. Small loss of information with approximate methods.

3. Effective handling of intractable likelihood integrals.

Abstract

Symbolic data analysis (SDA) aggregates large individual-level datasets into a small number of distributional summaries, such as random rectangles or random histograms. The inference is carried out using these summaries in place of the original dataset, resulting in computational gains at the loss of some information. In likelihood-based SDA, the likelihood function is characterised by an integral with a large exponent, which limits the method's utility as for typical models the integral is unavailable in closed form. In addition, the likelihood function is known to produce biased parameter estimates in some circumstances. Our article develops a Bayesian framework for SDA methods in these settings that resolves the issues resulting from integral intractability and biased parameter estimation using pseudo-marginal Markov chain Monte Carlo methods. We develop an exact but computationally…
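
The general pseudo-marginal idea mentioned in the abstract can be illustrated with a minimal sketch. It does not use the paper's symbolic likelihood or its estimators. It assumes a toy latent-variable model (z_j ~ N(theta, 1), y_j | z_j ~ N(z_j, 1)) whose likelihood is treated as if it were intractable and replaced by an unbiased Monte Carlo estimate. The function names `log_lik_hat` and `pseudo_marginal_mh`, the N(0, 10^2) prior, and all tuning values are hypothetical choices made for illustration.

```python
import numpy as np


def log_lik_hat(theta, y, n_particles, rng):
    """Log of an unbiased Monte Carlo estimate of p(y | theta).

    Toy model (an assumption, not the paper's likelihood):
    z_j ~ N(theta, 1) and y_j | z_j ~ N(z_j, 1), independently over j.
    """
    # Draw latent variables from their prior, one column per observation.
    z = theta + rng.standard_normal((n_particles, y.size))
    # Log density of each observation given each latent draw.
    log_w = -0.5 * (y - z) ** 2 - 0.5 * np.log(2.0 * np.pi)
    # Average the weights per observation with a stable log-mean-exp.
    m = log_w.max(axis=0)
    log_mean = m + np.log(np.mean(np.exp(log_w - m), axis=0))
    # A product of independent unbiased estimates is itself unbiased.
    return float(log_mean.sum())


def log_prior(theta):
    # Hypothetical N(0, 10^2) prior, up to an additive constant.
    return -0.5 * theta ** 2 / 100.0


def pseudo_marginal_mh(y, n_iter=2000, n_particles=50, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings driven by an estimated likelihood."""
    rng = np.random.default_rng(seed)
    theta = 0.0
    ll = log_lik_hat(theta, y, n_particles, rng)
    draws = np.empty(n_iter)
    accepted = 0
    for i in range(n_iter):
        prop = theta + step * rng.standard_normal()
        ll_prop = log_lik_hat(prop, y, n_particles, rng)
        log_alpha = ll_prop + log_prior(prop) - ll - log_prior(theta)
        if np.log(rng.uniform()) < log_alpha:
            # Keep the accepted estimate; it is NOT recomputed at later
            # iterations, which is what keeps the exact posterior as the target.
            theta, ll = prop, ll_prop
            accepted += 1
        draws[i] = theta
    return draws, accepted / n_iter


# Usage on simulated data: marginally y_j ~ N(theta, 2) with theta = 1.
data_rng = np.random.default_rng(1)
y = 1.0 + np.sqrt(2.0) * data_rng.standard_normal(20)
draws, acc_rate = pseudo_marginal_mh(y)
print("posterior mean estimate:", draws[500:].mean(), "acceptance rate:", acc_rate)
```

The key design point is that the likelihood estimate for the current state is stored and reused rather than refreshed. Because the estimator is unbiased, the chain still targets the exact posterior, although a noisier estimator (fewer particles) tends to lower the acceptance rate.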
