Zero-inflation in the Multivariate Poisson Lognormal Family
Bastien Batardière, Julien Chiquet, François Gindraud, Mahendra Mariadassou

TL;DR
This paper introduces the Zero-Inflated Poisson Lognormal (ZIPLN) model, extending the PLN model to better handle datasets with high zero-inflation, improving fit and interpretability in high-dimensional count data analysis.
Contribution
The paper proposes a novel ZIPLN model that incorporates zero-inflation into the PLN framework, with scalable variational inference methods for high-dimensional data.
Findings
ZIPLN effectively models datasets with up to 90% zero counts.
Accounting for zero-inflation improves model fit and latent space interpretability.
Application to microbiome data shows enhanced group discrimination.
Abstract
Analyzing high-dimensional count data is a challenge, and statistical model-based approaches provide an adequate and efficient framework that preserves explainability. The (multivariate) Poisson-Log-Normal (PLN) model is one such model: it assumes count data are driven by an underlying structured latent Gaussian variable, so that the dependencies between counts stem solely from the latent dependencies. However, PLN does not account for zero-inflation, a feature frequently observed in real-world datasets. Here we introduce the Zero-Inflated PLN (ZIPLN) model, which adds a multivariate zero-inflated component to the model as an additional Bernoulli latent variable. The zero-inflation can be fixed, site-specific, feature-specific, or dependent on covariates. We estimate model parameters using variational inference that scales up to datasets with a few thousand variables and compare two…
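The generative structure described in the abstract can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the authors' implementation: it assumes a log link between the latent Gaussian and the Poisson rate, a single fixed zero-inflation probability shared by all entries, and no covariates or offsets. The function name `simulate_zipln` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_zipln(n, p, pi=0.5, seed=0):
    """Draw an n x p count matrix from a simplified ZIPLN-style model.

    Assumptions (not taken from the paper): log link, one fixed
    zero-inflation probability `pi`, no covariates, no offsets.
    """
    rng = np.random.default_rng(seed)

    # Latent Gaussian layer: a random positive-definite covariance
    # carries all dependencies between features.
    mu = np.full(p, 1.0)
    A = rng.normal(size=(p, p)) / np.sqrt(p)
    sigma = A @ A.T + 0.5 * np.eye(p)
    Z = rng.multivariate_normal(mu, sigma, size=n)

    # Bernoulli latent layer: W[i, j] is True when entry (i, j)
    # is forced to zero by the zero-inflation component.
    W = rng.random((n, p)) < pi

    # Counts are Poisson given the latent Gaussian, then masked by W.
    counts = rng.poisson(np.exp(Z))
    Y = np.where(W, 0, counts)
    return Y, Z, W

Y, Z, W = simulate_zipln(n=200, p=5, pi=0.5)
print("share of zeros in Y:", (Y == 0).mean())
print("share forced to zero by W:", W.mean())
```

The share of zeros in `Y` is at least the share of entries masked by `W`, because the Poisson layer can also produce zeros on its own. Telling these two sources of zeros apart is what the zero-inflated component is meant to capture.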
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Statistical Methods and Inference · Data Analysis with R
Methods: Variational Inference
