# Optimizing genomic sampling for demographic and epidemiological inference with Markov decision processes

**Authors:** David A Rasmussen, Madeline G Bursell, Frank Burkhart

PMC · DOI: 10.1093/genetics/iyaf244 · Genetics · 2025-11-11

## TL;DR

This paper introduces a framework based on Markov decision processes for choosing genomic sampling strategies that maximize the information gained for demographic and epidemiological inference while minimizing sampling costs.

## Contribution

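The paper uses Markov decision processes to model how sampling interacts with a population's demographic history, and to predict the expected information gained from a sampling strategy so that optimal strategies can be identified.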

## Key findings

- Markov decision processes can efficiently identify optimal sampling strategies for estimating population growth rates.
- In genomic epidemiology, the framework identifies sampling strategies that minimize the transmission distance between sampled individuals.
- The framework also identifies optimal sampling strategies for estimating migration rates between subpopulations.

## Abstract

Inferences from population genomic data provide valuable insights into the demographic history of a population. Likewise, in genomic epidemiology, pathogen genomic data provide key insights into epidemic dynamics and potential sources of transmission. Yet, predicting what information will be gained from genomic data about variables of interest and how different sampling strategies will impact the quality of downstream inferences remains challenging. As a result, population genomics and related fields such as phylodynamics and phylogeography largely lack theory to guide decisions on how best to sample individuals for genomic sequencing. By adopting a sequential decision-making framework based on Markov decision processes, we model how sampling interacts with a population’s demographic history to shape the ancestral or genealogical relationships of sampled individuals. By probabilistically considering these ancestral relationships, we can use Markov decision processes to predict the expected value of sampling in terms of information gained about estimated variables. This in turn allows us to very efficiently explore and identify optimal sampling strategies even when the informational value of sampling depends on past or future sampling events. To illustrate our framework, we develop Markov decision processes for three common demographic and epidemiological inference problems: estimating population growth rates, minimizing the transmission distance between sampled individuals, and estimating migration rates between subpopulations. In each case, the Markov decision process allows us to identify optimal sampling strategies that maximize the information gained from genomic data while minimizing the associated costs of sampling.
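
To make the sequential decision-making idea concrete, the following is a minimal sketch of a deterministic, finite-horizon toy MDP solved by backward induction. It is not the model from the paper: the horizon `T`, the per-step cap `MAX_PER_STEP`, the per-sample `COST`, the time-point `WEIGHTS`, and the logarithmic information-gain function are all hypothetical choices made for illustration. The state is the number of samples already collected, so the value of sampling at one time point depends on past sampling, which is the kind of dependence the abstract describes.

```python
import math
from functools import lru_cache

# All constants below are hypothetical, chosen only for illustration.
T = 4                           # number of sampling time points
MAX_PER_STEP = 3                # maximum samples that can be taken per time point
COST = 0.25                     # cost charged per sample
WEIGHTS = [1.0, 0.8, 0.6, 0.4]  # assumed informativeness of each time point


def reward(t, n_prev, a):
    """Net reward for taking `a` samples at time `t` after `n_prev` earlier samples.

    Information gain has diminishing returns (log of cumulative sample count),
    so the payoff of sampling now depends on how much was sampled before.
    """
    gain = WEIGHTS[t] * (math.log1p(n_prev + a) - math.log1p(n_prev))
    return gain - COST * a


@lru_cache(maxsize=None)
def value(t, n_prev):
    """Backward induction: best achievable return from time `t` onward.

    Returns (optimal value, optimal sequence of actions from `t` to the end).
    """
    if t == T:
        return 0.0, ()
    best = None
    for a in range(MAX_PER_STEP + 1):
        future, plan = value(t + 1, n_prev + a)
        total = reward(t, n_prev, a) + future
        if best is None or total > best[0]:
            best = (total, (a,) + plan)
    return best


def total_return(plan):
    """Evaluate a fixed sampling plan (one action per time point) directly."""
    n_prev, total = 0, 0.0
    for t, a in enumerate(plan):
        total += reward(t, n_prev, a)
        n_prev += a
    return total


best_value, best_plan = value(0, 0)
print("optimal plan (samples per time point):", best_plan)
print("expected net value:", round(best_value, 4))
```

Because transitions here are deterministic, the optimal plan can be checked by brute force over all action sequences with `total_return`. A model closer to the setting described in the abstract would have stochastic transitions (for example, over the ancestral relationships of sampled individuals), in which case `value` would take an expectation over next states rather than following a single one.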

## Full-text entities

- **Diseases:** MDP (MESH:D020195), death (MESH:D003643), infectious pathogen (MESH:D003141), infected (MESH:D007239)
- **Chemicals:** gold (MESH:D006046)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12774829/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12774829/full.md

## References

65 references — full list in the complete paper: https://tomesphere.com/paper/PMC12774829/full.md

---
Source: https://tomesphere.com/paper/PMC12774829