# Assessing bias in susceptible–infected–recovered estimation from aggregated epidemic data

**Authors:** Naijian Shen, Lydia Bourouiba

PMC · DOI: 10.1098/rsos.240526 · Royal Society Open Science · 2025-07-23

## TL;DR

This paper shows that fitting an SIR model to epidemic data aggregated across distinct sub-regions can underestimate or overestimate the reproduction number R0, even when the fit appears to match the incidence data well.

## Contribution

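The study develops three analytical methods for extracting SIR parameters, in particular the reproduction number R0, from incidence data. It applies them to synthetic aggregates formed by summing two independent SIR waves with distinct R0 and a temporal offset.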

## Key findings

- R0 estimates from aggregated data can underestimate or overestimate the constituent waves' R0, even when the SIR prediction matches the incidence data well.
- The stronger the trailing wave, the longer the temporal offset over which the aggregated data still appear (erroneously) unimodal.
- In the special case of two waves of equal strength, the weaker the waves, the longer the offset over which the aggregated data still appear unimodal.

## Abstract

The canonical susceptible–infected–recovered (SIR) epidemic model is ubiquitous in assessing severity to guide interventions. It is typically applied to hierarchically aggregated data from distinct sub-regions. The introduced heterogeneity can lead to significant errors in estimated epidemic severity. We develop three analytical methods to extract SIR parameters from data, focusing on the reproduction number R0 that quantifies epidemic wave severity/strength. The estimation methods are applied to synthetically aggregated incidence data formulated by summing two independent SIR solutions of distinct R0 and separated by an onset delay, i.e. temporal offset. When applying the SIR model, we find that R0 estimates from the aggregated data can underestimate or overestimate the constituent epidemic waves’ R0 even when the prediction appears to agree well with the incidence data, resulting in erroneous unimodal epidemic dynamics. We find that for two epidemic waves, the stronger the trailing wave, the longer the temporal offset that maintains apparent erroneous unimodal aggregated data. In the special case of two equivalent epidemic strengths, however, the weaker the waves, the longer the offset that maintains apparent unimodal aggregated data. We provide sensitivity analyses with respect to noise perturbation of the data and illustrate our approach using historical influenza data.
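
The aggregation described in the abstract, summing two independent SIR solutions separated by an onset delay, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code or any of their three estimation methods. It assumes a normalized SIR model (s + i + r = 1) with time measured in units of the recovery time, a fixed-step RK4 integrator, and a simple windowed peak counter. The names `sir_incidence`, `aggregate`, and `count_peaks`, and all parameter values (`gamma`, `i0`, `dt`, `t_end`, `window`, `rel_floor`), are hypothetical choices made for this example.

```python
def sir_incidence(r0, gamma=1.0, i0=1e-6, dt=0.02, t_end=80.0):
    """Integrate a normalized SIR model with RK4 and return the
    incidence beta*s*i at each time step (assumed parameter values)."""
    beta = r0 * gamma  # standard relation R0 = beta / gamma

    def deriv(s, i):
        new_inf = beta * s * i
        return -new_inf, new_inf - gamma * i

    s, i = 1.0 - i0, i0
    out = []
    for _ in range(int(round(t_end / dt)) + 1):
        out.append(beta * s * i)
        k1s, k1i = deriv(s, i)
        k2s, k2i = deriv(s + 0.5 * dt * k1s, i + 0.5 * dt * k1i)
        k3s, k3i = deriv(s + 0.5 * dt * k2s, i + 0.5 * dt * k2i)
        k4s, k4i = deriv(s + dt * k3s, i + dt * k3i)
        s += dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0
        i += dt * (k1i + 2 * k2i + 2 * k3i + k4i) / 6.0
    return out


def aggregate(r0_lead, r0_trail, offset, dt=0.02, t_end=80.0):
    """Sum a leading wave and a trailing wave whose onset is delayed
    by `offset` time units (the temporal offset)."""
    lead = sir_incidence(r0_lead, dt=dt, t_end=t_end)
    trail = sir_incidence(r0_trail, dt=dt, t_end=t_end)
    shift = min(int(round(offset / dt)), len(trail))
    # Zero incidence before the trailing wave starts, then the shifted wave.
    shifted = [0.0] * shift + trail[:len(trail) - shift]
    return [a + b for a, b in zip(lead, shifted)]


def count_peaks(series, window=50, rel_floor=1e-2):
    """Count points that are the maximum of their +/- `window` neighbours
    and exceed a fraction `rel_floor` of the global maximum.
    The window suppresses the small jump at the trailing wave's onset."""
    floor = rel_floor * max(series)
    peaks = 0
    for k in range(1, len(series) - 1):
        if series[k] <= floor:
            continue
        left = series[max(0, k - window):k]
        right = series[k + 1:k + window + 1]
        if series[k] > max(left) and series[k] >= max(right):
            peaks += 1
    return peaks


if __name__ == "__main__":
    for offset in (0.0, 30.0):
        n = count_peaks(aggregate(2.5, 2.5, offset))
        print(f"offset={offset}: {n} peak(s)")
```

With these assumed values, two waves with no offset collapse into a single peak, while a sufficiently large offset separates them into two. Sweeping `offset` between those extremes for different pairs of R0 values is one way to explore where the aggregated curve stops looking unimodal.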

## Linked entities

- **Diseases:** influenza (MONDO:0005812)

## Full-text entities

- **Diseases:** influenza (MESH:D007251)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12289188/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12289188/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC12289188/full.md

---
Source: https://tomesphere.com/paper/PMC12289188