TL;DR
This paper introduces a probabilistic method for estimating genetic distances between unsequenced pathogen cases and known sequences. It models divergence as a function of collection time, so evolutionary information can be brought into spatiotemporal models without sequence alignment.
Contribution
It presents a novel probabilistic framework that imputes genetic distances in pathogen data, accommodating incomplete sequencing and integrating evolutionary insights into spatial models.
Findings
Imputes genetic distances for highly pathogenic avian influenza A/H5 cases in wild birds in the United States
Supports scalable, uncertainty-aware augmentation of genomic datasets
Enhances the integration of evolutionary information into spatiotemporal modeling workflows
Abstract
Pathogen genome data offers valuable structure for spatial models, but its utility is limited by incomplete sequencing coverage. We propose a probabilistic framework for inferring genetic distances between unsequenced cases and known sequences within defined transmission chains, using time-aware evolutionary distance modeling. The method estimates pairwise divergence from collection dates and observed genetic distances, enabling biologically plausible imputation grounded in observed divergence patterns, without requiring sequence alignment or known transmission chains. Applied to highly pathogenic avian influenza A/H5 cases in wild birds in the United States, this approach supports scalable, uncertainty-aware augmentation of genomic datasets and enhances the integration of evolutionary information into spatiotemporal modeling workflows.
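The idea of estimating pairwise divergence from collection dates and observed genetic distances can be illustrated with a minimal sketch. This is not the authors' model: it assumes, purely for illustration, a linear clock-like relationship (distance = rate × time gap) with Gaussian noise, and the names `fit_clock` and `impute_distance` are hypothetical. It fits the rate on sequenced pairs, then draws plausible distances for a pair involving an unsequenced case, so the imputation carries uncertainty rather than a single point value.

```python
import math
import random


def fit_clock(pairs):
    """Fit distance = rate * |dt| through the origin by least squares.

    pairs: list of (dt_days, genetic_distance) for sequenced case pairs.
    Returns (rate, residual_sd). Assumes at least two pairs and a nonzero dt.
    """
    pts = [(abs(dt), d) for dt, d in pairs]
    sxx = sum(dt * dt for dt, _ in pts)
    sxy = sum(dt * d for dt, d in pts)
    rate = sxy / sxx
    # Root-mean-square residual around the fitted line (one fitted parameter).
    sse = sum((d - rate * dt) ** 2 for dt, d in pts)
    residual_sd = math.sqrt(sse / (len(pts) - 1))
    return rate, residual_sd


def impute_distance(rate, residual_sd, dt_days, n_draws=1000, rng=None):
    """Draw plausible distances for a pair separated by dt_days.

    Draws are Gaussian around the fitted line and clipped at zero, because
    a genetic distance cannot be negative.
    """
    rng = rng or random.Random()
    mean = rate * abs(dt_days)
    return [max(0.0, rng.gauss(mean, residual_sd)) for _ in range(n_draws)]
```

For example, after `rate, sd = fit_clock(observed_pairs)`, calling `impute_distance(rate, sd, dt_days=15)` returns a sample of distances whose spread reflects how tightly the observed pairs follow the fitted trend. A real analysis would likely need a richer divergence model than this straight line.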
