# Using zero-inflated and hurdle regression models to analyze schistosomiasis data of school children in the southern areas of Ghana

**Authors:** Kojo Nketia, Dziedzom K. de Souza

PMC · DOI: 10.1371/journal.pone.0304681 · PLOS ONE · 2024-07-12

## TL;DR

This study compares statistical models for analyzing schistosomiasis count data from Ghanaian school children and finds that zero-inflated and hurdle negative binomial models fit the highly zero-inflated data best.

## Contribution

The study evaluates and recommends zero-inflated and hurdle regression models for handling over-dispersed, zero-inflated schistosomiasis count data.
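For orientation, the two model families can be written out as follows. This is the standard textbook formulation, not notation taken from the paper: the symbols $f$, $\pi$, $\pi_0$, $\mu$ and $\alpha$ are assumptions introduced here. Let $f(y)$ be the base count distribution (Poisson or negative binomial).

$$
\text{Zero-inflated:}\qquad
P(Y = y) =
\begin{cases}
\pi + (1 - \pi)\, f(0), & y = 0 \\
(1 - \pi)\, f(y), & y = 1, 2, \dots
\end{cases}
$$

$$
\text{Hurdle:}\qquad
P(Y = y) =
\begin{cases}
\pi_0, & y = 0 \\
(1 - \pi_0)\, \dfrac{f(y)}{1 - f(0)}, & y = 1, 2, \dots
\end{cases}
$$

In the zero-inflated form, zeros arise from two sources: a "structural" zero with probability $\pi$, or a zero drawn from the count distribution itself. In the hurdle form, all zeros come from a single binary process and the positive counts follow a zero-truncated distribution. Over-dispersion is handled by the choice of $f$: a Poisson has $\operatorname{Var}(Y) = \mu$, whereas a negative binomial (in the common NB2 parameterization) has $\operatorname{Var}(Y) = \mu + \alpha \mu^2$ with $\alpha > 0$.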

## Key findings

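- 94.1% of the 1345 participants tested had no schistosomiasis eggs detected, indicating high zero inflation.
- All three negative binomial models (standard, zero-inflated, hurdle) outperformed the Poisson-based models (Poisson, zero-inflated Poisson, hurdle Poisson).
- The zero-inflated negative binomial (ZINB) and hurdle negative binomial (HNB) models provided the best fit for the over-dispersed, zero-inflated count data and were indistinguishable by information criteria.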
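To make the idea of comparing models by information criteria concrete, here is a minimal sketch in pure Python. It is not the authors' analysis: the counts are synthetic (94 zeros out of 100, chosen only to resemble the reported zero fraction), the models are intercept-only with no covariates, and a zero-inflated Poisson stands in for the negative binomial variants to keep the code short. The function names and the EM fitting routine are illustrative assumptions.

```python
import math

def poisson_logpmf(k, lam):
    """Log of the Poisson probability mass function."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def fit_poisson(counts):
    """Intercept-only Poisson: the MLE of the rate is the sample mean."""
    lam = sum(counts) / len(counts)
    loglik = sum(poisson_logpmf(k, lam) for k in counts)
    return lam, loglik

def fit_zip(counts, n_iter=500):
    """Intercept-only zero-inflated Poisson fitted by EM.

    Returns (pi, lam, loglik), where pi is the structural-zero probability.
    """
    n = len(counts)
    n_zero = sum(1 for k in counts if k == 0)
    total = sum(counts)
    pi, lam = 0.5, max(total / n, 1e-6)  # arbitrary starting values
    for _ in range(n_iter):
        # E-step: probability that an observed zero is a structural zero
        w = pi / (pi + (1 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson rate
        pi = w * n_zero / n
        lam = total / (n - w * n_zero)
    loglik = 0.0
    for k in counts:
        if k == 0:
            loglik += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            loglik += math.log(1 - pi) + poisson_logpmf(k, lam)
    return pi, lam, loglik

def aic(loglik, n_params):
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * loglik

# Synthetic egg counts (NOT the study data): 94 zeros and 6 positives.
counts = [0] * 94 + [3, 5, 8, 12, 20, 41]

lam_poisson, ll_poisson = fit_poisson(counts)
pi_zip, lam_zip, ll_zip = fit_zip(counts)

aic_poisson = aic(ll_poisson, n_params=1)
aic_zip = aic(ll_zip, n_params=2)

print(f"Poisson: lambda={lam_poisson:.2f}, AIC={aic_poisson:.1f}")
print(f"ZIP:     pi={pi_zip:.2f}, lambda={lam_zip:.2f}, AIC={aic_zip:.1f}")
```

On data like these, the plain Poisson has to use one small rate to explain both the mass of zeros and the large positive counts, so its AIC is far higher than that of the zero-inflated model. A real analysis would include covariates and negative binomial variants, for example via `statsmodels.discrete.count_model.ZeroInflatedNegativeBinomialP` in Python or `zeroinfl()` and `hurdle()` from the R package `pscl`.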

## Abstract

Schistosomiasis is a neglected disease prevalent in tropical and sub-tropical areas of the world, especially in Africa. Detection of the disease relies on finding the parasites in the stool or urine of children and adults. Data collected in such studies typically include many uninfected individuals, which leads to high zero inflation; count data with excess zeros are therefore common in practice. The purpose of this analysis is to apply statistical models to such count data and to evaluate their performance and results.

This is a secondary analysis of previously collected data. As part of the modelling process, Poisson regression, negative binomial regression and their associated zero-inflated and hurdle models were compared to determine which offered the best fit to the count data.

Overall, 94.1% of the 1345 study participants tested had no schistosomiasis eggs, resulting in high zero inflation. The negative binomial regression models (hurdle negative binomial (HNB), zero-inflated negative binomial (ZINB) and the standard negative binomial) performed better than the Poisson-based regression models (Poisson, zero-inflated Poisson, hurdle Poisson). The best models were the ZINB and HNB, whose performances were indistinguishable according to information-based criteria.

The zero-inflated negative binomial and hurdle negative binomial models provided the most satisfactory fit to the over-dispersed, zero-inflated count data and are recommended for future statistical modelling of similar data.

## Linked entities

- **Diseases:** schistosomiasis (MONDO:0015254)

## Full-text entities

- **Diseases:** Schistosomiasis (MESH:D012552)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11244785/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11244785/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/PMC11244785/full.md

---
Source: https://tomesphere.com/paper/PMC11244785