Understanding When Poisson Log-Normal Models Outperform Penalized Poisson Regression for Microbiome Count Data

Daniel Agyapong; Julien Chiquet; Jane Marks; Toby Dylan Hocking

arXiv:2604.03853·cs.LG·April 7, 2026

TL;DR

This study compares Poisson Log-Normal (PLN) models with penalized Poisson regression, GLMNet(Poisson), on real microbiome count data. It offers practical guidance based on held-out evaluation across 20 count-prediction datasets and five network-inference datasets.

Contribution

A unified held-out evaluation framework, together with empirical evidence on when PLN models outperform penalized Poisson regression in microbiome analysis.

Findings

01

PLN outperforms GLMNet(Poisson) on most count-prediction datasets, with gains of up to 38%.

02

Sample-to-taxon ratio is the primary predictor of model performance.

03

PLNNetwork excels in broad undirected interaction benchmarks, while GLMNet(Poisson) suits local or directional effects.

Abstract

Multivariate count models are often justified by their ability to capture latent dependence, but researchers receive little guidance on when this added structure improves on simpler penalized marginal Poisson regression. We study this question using real microbiome data under a unified held-out evaluation framework. For count prediction, we compare PLN and GLMNet(Poisson) on 20 datasets spanning 32 to 18,270 samples and 24 to 257 taxa, using held-out Poisson deviance under leave-one-taxon-out prediction with 3-fold sample cross-validation rather than synthetic or in-sample criteria. For network inference, we compare PLNNetwork and GLMNet(Poisson) neighborhood selection on five publicly available datasets with experimentally validated microbial interaction truth. PLN outperforms GLMNet(Poisson) on most count-prediction datasets, with gains up to 38 percent. The primary predictor of the…
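
The evaluation protocol named in the abstract (held-out Poisson deviance, leave-one-taxon-out prediction, 3-fold sample cross-validation) can be sketched roughly as follows. This is a minimal illustration of one plausible reading of that protocol, not the authors' code: the function names, the interleaved fold assignment, and the `mean_baseline` predictor are all assumptions made for the example, and a real comparison would plug in a PLN or penalized Poisson fit in place of the baseline.

```python
import math


def poisson_deviance(y_true, mu_pred):
    """Mean Poisson deviance: 2 * (y*log(y/mu) - (y - mu)), averaged.

    The y*log(y/mu) term is taken as 0 when y == 0.
    """
    if len(y_true) != len(mu_pred):
        raise ValueError("y_true and mu_pred must have the same length")
    total = 0.0
    for y, mu in zip(y_true, mu_pred):
        if mu <= 0:
            raise ValueError("predicted means must be strictly positive")
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total / len(y_true)


def kfold_indices(n_samples, n_folds=3):
    """Yield (train, test) index lists; fold f holds samples with i % n_folds == f.

    The interleaved assignment is an assumption chosen for simplicity.
    """
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


def leave_one_taxon_out_deviance(counts, fit_predict, n_folds=3):
    """Average held-out deviance over every (target taxon, fold) pair.

    counts: list of samples, each a list of per-taxon counts.
    fit_predict: hypothetical callable (train_X, train_y, test_X) -> predicted means.
    """
    n_samples, n_taxa = len(counts), len(counts[0])
    scores = []
    for target in range(n_taxa):
        for train, test in kfold_indices(n_samples, n_folds):

            def features(rows):
                # All taxa except the held-out target serve as predictors.
                return [[counts[i][j] for j in range(n_taxa) if j != target]
                        for i in rows]

            def response(rows):
                return [counts[i][target] for i in rows]

            mu = fit_predict(features(train), response(train), features(test))
            scores.append(poisson_deviance(response(test), mu))
    return sum(scores) / len(scores)


def mean_baseline(train_X, train_y, test_X):
    """Placeholder model: predict the training mean of the target taxon."""
    mean = max(sum(train_y) / len(train_y), 1e-8)
    return [mean] * len(test_X)


# Toy data: 6 samples x 3 taxa (made up for illustration only).
toy_counts = [[3, 0, 5], [1, 2, 4], [0, 1, 7], [4, 0, 2], [2, 3, 6], [1, 1, 3]]
baseline_score = leave_one_taxon_out_deviance(toy_counts, mean_baseline)
```

Lower deviance is better, so two models would be compared by passing each one as `fit_predict` on the same folds and contrasting the resulting scores.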
