A statistical framework for comparing epidemic forests
Cyril Geismar, Peter J. White, Anne Cori, Thibaut Jombart

TL;DR
This paper introduces a statistical framework, based on a chi-square test and PERMANOVA, for testing whether epidemic forests (sets of plausible transmission trees for the same outbreak) differ significantly from one another.
Contribution
It presents a formal statistical test for whether the differences between epidemic forests are significant; according to the abstract, no such test previously existed.
Findings
PERMANOVA outperforms chi-square in sensitivity across various epidemic sizes.
Both methods achieve perfect specificity with large forests (100+ trees).
The framework is implemented in the R package mixtree.
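To make the chi-square comparison concrete, here is a minimal Python sketch. It is not the mixtree API and may not match the paper's exact procedure: it assumes each tree is a dict mapping an infectee to its infector, that a forest is a list of such trees, and that the contingency table counts how often each infector-infectee pair occurs in each forest. The names `pair_counts` and `chi_square_statistic` are hypothetical.

```python
from collections import Counter


def pair_counts(forest):
    """Count how often each (infector, infectee) pair occurs across a forest's trees."""
    counts = Counter()
    for tree in forest:  # tree: dict mapping infectee -> infector (assumed encoding)
        for infectee, infector in tree.items():
            counts[(infector, infectee)] += 1
    return counts


def chi_square_statistic(forests):
    """Pearson chi-square statistic for a pairs-by-forests contingency table.

    Rows are transmission pairs seen in any forest; columns are forests.
    Returns (statistic, degrees_of_freedom).
    """
    counts = [pair_counts(forest) for forest in forests]
    pairs = sorted(set().union(*counts))
    col_totals = [sum(c.values()) for c in counts]
    grand_total = sum(col_totals)

    stat = 0.0
    for pair in pairs:
        row_total = sum(c[pair] for c in counts)
        for c, col_total in zip(counts, col_totals):
            expected = row_total * col_total / grand_total
            stat += (c[pair] - expected) ** 2 / expected

    dof = (len(pairs) - 1) * (len(forests) - 1)
    return stat, dof


# Toy example: two forests of three trees each, index case "A".
forest_a = [{"B": "A", "C": "A"}] * 3
forest_b = [{"B": "A", "C": "B"}] * 3
stat, dof = chi_square_statistic([forest_a, forest_b])
print(stat, dof)
```

The p-value would be the upper tail of a chi-square distribution with `dof` degrees of freedom (for example via `scipy.stats.chi2.sf`, which is not imported here to keep the sketch dependency-free).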
Abstract
Inferring who infected whom in an outbreak is essential for characterising transmission dynamics and guiding public health interventions. However, this task is challenging due to limited surveillance data and the complexity of immunological and social interactions. Instead of a single definitive transmission tree, epidemiologists often consider multiple plausible trees forming epidemic forests. Various inference methods and assumptions can yield different epidemic forests, yet no formal test exists to assess whether these differences are statistically significant. We propose such a framework using a chi-square test and permutational multivariate analysis of variance (PERMANOVA). We assessed each method's ability to distinguish simulated epidemic forests generated under different offspring distributions. While both methods achieved perfect specificity for forests with 100+…
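The PERMANOVA idea can be illustrated with a small, dependency-free Python sketch. It rests on assumptions that do not come from the paper: trees are dicts mapping infectee to infector, the distance between two trees is the fraction of cases assigned a different infector (the abstract does not state the metric the authors use), and the function names are hypothetical. The pseudo-F statistic and label-permutation p-value follow the standard PERMANOVA construction.

```python
import random


def tree_distance(t1, t2):
    """Fraction of cases whose infector differs between two trees (assumed metric)."""
    return sum(t1[case] != t2[case] for case in t1) / len(t1)


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F computed from a distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares: squared pairwise distances divided by n.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, each group scaled by its own size.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(
            dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:]
        ) / len(idx)
    ss_between = ss_total - ss_within
    if ss_within == 0:  # no variation inside any group
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(forests, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value) for a list of forests."""
    trees = [tree for forest in forests for tree in forest]
    labels = [g for g, forest in enumerate(forests) for _ in forest]
    dist = [[tree_distance(a, b) for b in trees] for a in trees]
    f_obs = pseudo_f(dist, labels)

    rng = random.Random(seed)
    shuffled = labels[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign trees to forests at random
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return f_obs, (hits + 1) / (n_perm + 1)
```

Permuting the forest labels rather than relying on a reference distribution is what lets the test work with an arbitrary tree distance; swapping in a different metric only requires changing `tree_distance`.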
Taxonomy
Topics: COVID-19 epidemiological studies · Zoonotic diseases and public health · Data-Driven Disease Surveillance
