Statistically efficient thinning of a Markov chain sampler
Art B. Owen

TL;DR
This paper investigates when thinning Markov chain Monte Carlo (MCMC) output can improve statistical efficiency under a fixed computational budget. Thinning helps when the quantity of interest is expensive to compute and the autocorrelations decay slowly, and the paper provides formulas for the optimal thinning frequency.
Contribution
The paper introduces a cost framework showing that thinning can be beneficial when the quantity of interest is expensive to compute and autocorrelations decay slowly, with explicit formulas for the optimal thinning frequency in AR(1) processes.
Findings
Thinning improves efficiency if the cost of computing the quantity of interest is large relative to the cost of advancing the chain and the autocorrelations decay slowly.
The optimal thinning frequency grows rapidly as the autocorrelation approaches 1.
Thinning is not beneficial when the autocorrelation is non-positive or the computation cost is low.
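The findings above can be explored numerically with a minimal sketch. It assumes an AR(1) process with lag-1 autocorrelation rho, a cost of one time unit per chain step plus theta time units per computed quantity, and the standard asymptotic variance factor (1 + r) / (1 - r) for the mean of an AR(1) series with lag-1 autocorrelation r. The function names, the brute-force search, and the k_max cap are illustrative choices, not the paper's own code.

```python
def cost_times_variance(k: int, rho: float, theta: float) -> float:
    """Asymptotic variance per unit of budget (up to a constant factor)
    when keeping every k-th draw of an AR(1) chain.

    Each kept draw costs k chain steps plus theta for the computation,
    and the thinned series is AR(1) with autocorrelation rho**k.
    """
    rk = rho ** k
    return (k + theta) * (1.0 + rk) / (1.0 - rk)


def efficiency(k: int, rho: float, theta: float) -> float:
    """Efficiency of thinning factor k relative to no thinning (k = 1).
    Values above 1 mean thinning gives lower variance for the same budget."""
    return cost_times_variance(1, rho, theta) / cost_times_variance(k, rho, theta)


def best_k(rho: float, theta: float, k_max: int = 10_000) -> int:
    """Brute-force search for the thinning factor with the lowest
    variance per unit budget. k_max is an arbitrary cap (assumption)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return min(range(1, k_max + 1),
               key=lambda k: cost_times_variance(k, rho, theta))


if __name__ == "__main__":
    for rho in (-0.5, 0.0, 0.5, 0.9, 0.99):
        for theta in (0.1, 1.0, 10.0):
            k = best_k(rho, theta)
            print(f"rho={rho:5.2f} theta={theta:5.1f} "
                  f"best k={k:4d} efficiency={efficiency(k, rho, theta):.3f}")
```

Running the script prints the best thinning factor for a small grid of rho and theta values, which makes it easy to check the qualitative pattern described in the findings under these assumptions.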
Abstract
It is common to subsample Markov chain output to reduce the storage burden. Geyer (1992) shows that discarding k − 1 out of every k observations will not improve statistical efficiency, as quantified through variance in a given computational budget. That observation is often taken to mean that thinning MCMC output cannot improve statistical efficiency. Here we suppose that it costs one unit of time to advance a Markov chain and then θ > 0 units of time to compute a sampled quantity of interest. For a thinned process, that cost θ is incurred less often, so it can be advanced through more stages. Here we provide examples to show that thinning will improve statistical efficiency if θ is large and the sample autocorrelations decay slowly enough. If the lag ℓ ≥ 1 autocorrelations of a scalar measurement satisfy ρ_ℓ ≥ ρ_{ℓ+1} ≥ 0, then there is always a…
