Modeling Sparse Data Using MLE with Applications to Microbiome Data

Hani Aldirawi; Jie Yang

arXiv:2112.13903 · stat.ME · December 30, 2021

TL;DR

This paper develops MLE methods for zero-inflated and hurdle models, including new models like zero-inflated beta negative binomial, demonstrating improved fit for microbiome data compared to existing models.

Contribution

It derives MLE and Fisher information for a broad class of zero-inflated models and introduces new models tailored for microbiome data analysis.
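For orientation, the two model families named above are commonly written in terms of a baseline probability mass function. The block below is a sketch in notation assumed here, not taken from the paper: $f_\theta$ is the baseline pmf (for example Poisson or negative binomial), $\phi$ is the zero-inflation or hurdle weight, and $n_0$ is the number of zeros among $n$ observations $y_1, \dots, y_n$. The paper's own symbols and parameterization may differ.

```latex
% Zero-inflated model: extra mass phi at zero, mixed with the baseline pmf
P_{\mathrm{ZI}}(Y = y) =
\begin{cases}
  \phi + (1 - \phi)\, f_\theta(0), & y = 0, \\
  (1 - \phi)\, f_\theta(y),        & y = 1, 2, \dots
\end{cases}

% Hurdle model: zeros handled separately, positives from the zero-truncated baseline
P_{\mathrm{H}}(Y = y) =
\begin{cases}
  \phi, & y = 0, \\
  (1 - \phi)\, \dfrac{f_\theta(y)}{1 - f_\theta(0)}, & y = 1, 2, \dots
\end{cases}

% Log-likelihoods maximized by the MLE
\ell_{\mathrm{ZI}}(\phi, \theta)
  = n_0 \log\bigl[\phi + (1 - \phi) f_\theta(0)\bigr]
  + \sum_{i:\, y_i > 0} \bigl[\log(1 - \phi) + \log f_\theta(y_i)\bigr]

\ell_{\mathrm{H}}(\phi, \theta)
  = n_0 \log \phi + (n - n_0) \log(1 - \phi)
  + \sum_{i:\, y_i > 0} \bigl[\log f_\theta(y_i) - \log\bigl(1 - f_\theta(0)\bigr)\bigr]
```

In the hurdle case the likelihood separates into a term in $\phi$ and a term in $\theta$, so $\hat{\phi} = n_0 / n$; in the zero-inflated case the two parameters are coupled through the zero term and generally have to be estimated jointly.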

Findings

01. New models outperform traditional ones in microbiome data fitting

02. MLE provides reliable parameter estimation for zero-inflated models

03. Application shows improved modeling accuracy for sparse microbiome data

Abstract

Modeling sparse data such as microbiome and transcriptomics (RNA-seq) data is challenging because of the excess of zeros and the skewness of the distribution. Many probabilistic models have been used for modeling sparse data, including Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial models. One way to identify the most appropriate model among zero-inflated or hurdle models is based on the p-value of the Kolmogorov-Smirnov (KS) test. The main challenge in identifying the probabilistic model is that the model parameters are typically unknown in practice. This paper derives the maximum likelihood estimator (MLE) for a general class of zero-inflated and hurdle models. We also derive the corresponding Fisher information matrices for exploring the estimator's asymptotic properties. We include new probabilistic models such as…
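As an illustration of what maximum likelihood estimation looks like for the simplest model in this family, here is a minimal Python sketch that fits a zero-inflated Poisson by numerical optimization. It is not the paper's procedure and does not cover the new models the abstract refers to. The function names `zip_negloglik` and `fit_zip`, the logit/log reparameterization, the starting values, and the box bounds are all assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln


def zip_negloglik(params, y):
    """Negative log-likelihood of a zero-inflated Poisson sample.

    params holds (logit(phi), log(lam)) so the optimizer works on an
    unconstrained scale; phi is the zero-inflation weight, lam the Poisson mean.
    """
    phi = expit(params[0])
    lam = np.exp(params[1])
    n_zero = np.sum(y == 0)
    pos = y[y > 0]
    # log P(Y = 0) = log(phi + (1 - phi) * exp(-lam))
    ll_zero = n_zero * np.log(phi + (1.0 - phi) * np.exp(-lam))
    # log P(Y = k) = log(1 - phi) + k * log(lam) - lam - log(k!) for k >= 1
    ll_pos = np.sum(np.log1p(-phi) + pos * np.log(lam) - lam - gammaln(pos + 1.0))
    return -(ll_zero + ll_pos)


def fit_zip(y):
    """Return (phi_hat, lam_hat) maximizing the zero-inflated Poisson likelihood."""
    y = np.asarray(y)
    # Hypothetical starting point: phi = 0.5, lam = sample mean (floored at 0.1).
    x0 = np.array([0.0, np.log(max(y.mean(), 0.1))])
    # Box bounds keep phi away from 0 and 1 and lam in a finite range,
    # which avoids infinite log-likelihood values during the line search.
    res = minimize(zip_negloglik, x0, args=(y,), method="L-BFGS-B",
                   bounds=[(-10.0, 10.0), (-10.0, 10.0)])
    return expit(res.x[0]), np.exp(res.x[1])


if __name__ == "__main__":
    # Simulated data: each observation is a structural zero with probability 0.6,
    # otherwise a Poisson(3) draw.
    rng = np.random.default_rng(0)
    n = 5000
    structural = rng.random(n) < 0.6
    y = np.where(structural, 0, rng.poisson(3.0, n))
    phi_hat, lam_hat = fit_zip(y)
    print(f"phi_hat = {phi_hat:.3f}, lam_hat = {lam_hat:.3f}")
```

Once the parameters are estimated, the fitted distribution can be compared against the data, which is the role the abstract assigns to the KS test. Swapping in another baseline distribution only requires changing the two log-probability lines inside `zip_negloglik`.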

Taxonomy

Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Statistical Methods and Bayesian Inference