Inference on High-Dimensional Sparse Count Data
Jyotishka Datta, David B. Dunson

TL;DR
This paper introduces a new Bayesian approach with local-global shrinkage priors for analyzing high-dimensional sparse count data, improving flexibility and accuracy over existing models.
Contribution
The authors develop a novel class of continuous shrinkage priors specifically designed for sparse count data, with theoretical guarantees and superior empirical performance.
Findings
Posterior concentration established as a theoretical property.
Enhanced control of false discoveries in multiple testing.
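Robustness of the posterior mean and super-efficiency in estimating the sampling density, established theoretically.
Excellent small-sample properties relative to competitors in simulation studies.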
Abstract
In a variety of application areas, there is a growing interest in analyzing high dimensional sparse count data, with sparsity exhibited by an over-abundance of zeros and small non-zero counts. Existing approaches for analyzing multivariate count data via Poisson or negative binomial log-linear hierarchical models with zero-inflation cannot flexibly adapt to the level and nature of sparsity in the data. We develop a new class of continuous local-global shrinkage priors tailored for sparse counts. Theoretical properties are assessed, including posterior concentration, stronger control on false discoveries in multiple testing, robustness in posterior mean and super-efficiency in estimating the sampling density. Simulation studies illustrate excellent small sample properties relative to competitors. We apply the method to detect rare mutational hotspots in exome sequencing data and to…
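To make the idea of local-global shrinkage for counts concrete, here is a minimal sketch. It does not reproduce the prior developed in the paper, whose exact form is not given in this summary. It assumes a generic Poisson-gamma hierarchy with half-Cauchy local scales; the function names, the shape parameter alpha, and the global scale tau are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_sparse_counts(n, tau, alpha=1.0, seed=0):
    """Draw counts from an ASSUMED local-global hierarchy (not the paper's prior):
    lam_i ~ half-Cauchy(0, 1)                       (local scales)
    theta_i ~ Gamma(shape=alpha, scale=(lam_i * tau)**2)
    y_i ~ Poisson(theta_i)
    A small global scale tau pushes most rates toward zero, while the heavy
    Cauchy tail lets a few rates stay large.
    """
    rng = np.random.default_rng(seed)
    lam = np.abs(rng.standard_cauchy(n))
    theta = rng.gamma(shape=alpha, scale=(lam * tau) ** 2)
    y = rng.poisson(theta)
    return y, theta, lam


def conditional_posterior_mean(y, lam, tau, alpha=1.0):
    """Posterior mean of theta_i given y_i and the scales, by gamma-Poisson
    conjugacy: theta_i | y_i ~ Gamma(shape=alpha + y_i, rate=1/s_i + 1),
    with s_i = (lam_i * tau)**2. Writing kappa_i = 1 / (1 + s_i), the mean is
    (1 - kappa_i) * (y_i + alpha), so kappa_i near 1 means strong shrinkage.
    """
    s = (lam * tau) ** 2
    kappa = 1.0 / (1.0 + s)
    return (1.0 - kappa) * (y + alpha), kappa


if __name__ == "__main__":
    y, theta, lam = simulate_sparse_counts(n=2000, tau=0.1)
    post_mean, kappa = conditional_posterior_mean(y, lam, tau=0.1)
    print("fraction of zero counts:", (y == 0).mean())
    print("largest count:", y.max())
    print("median shrinkage weight kappa:", np.median(kappa))
```

In this sketch the shrinkage weight kappa plays the role that the local and global scales play in continuous shrinkage priors generally: with lam = 1, tau = 1 and alpha = 1, kappa is 0.5, so an observed count of 5 is pulled to a conditional posterior mean of 3.0. A full analysis would also place priors on the scales and integrate over them, which this sketch does not attempt.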
Taxonomy
Topics: Genetic Associations and Epidemiology · Statistical Methods and Inference · Bayesian Methods and Mixture Models
