# How does date-rounding affect phylodynamic inference for public health?

**Authors:** Leo A. Featherstone, Danielle J. Ingle, Wytamma Wirth, Sebastian Duchene, Joel O. Wertheim, Virginia E. Pitzer

PMC · DOI: 10.1371/journal.pcbi.1012900 · 2025-04-11

## TL;DR

This paper examines how reducing the resolution of sampling dates affects the accuracy of phylodynamic inferences used in public health.

## Contribution

The study provides a practical guideline for identifying when date-rounding biases phylodynamic parameter estimates.

## Key findings

- Date-rounding introduces bias in epidemiological parameter inference, with direction varying by parameter and dataset.
- Bias decreases with longer sampling intervals, making the guideline most relevant for emerging datasets.
- A method for safer date sharing is proposed to protect patient confidentiality while preserving inference accuracy.

## Abstract

Phylodynamic analyses infer epidemiological parameters from pathogen genome sequences for enhanced genomic surveillance in public health. Pathogen genome sequences and their associated sampling dates are the essential data in every analysis. However, sampling dates are usually associated with hospitalisation or testing and can sometimes be used to identify individual patients, posing a threat to patient confidentiality. To lower this risk, sampling dates are often given with reduced date-resolution to the month or year, which can potentially bias inference. Here, we introduce a practical guideline on when date-rounding biases the inference of epidemiologically important parameters across a diverse range of empirical and simulated datasets. We show that the direction of bias varies for different parameters, datasets, and tree priors, while compounding with lower date-resolution and higher substitution rates. We also find that bias decreases for datasets with longer sampling intervals, implying that our guideline is most applicable to emerging datasets. We conclude by discussing future solutions that prioritise patient confidentiality and propose a method for safer sharing of sampling dates that translates them uniformly by a random number.
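The proposed date-sharing method, translating all sampling dates uniformly by a random number, can be illustrated with a minimal sketch. The function name `shift_dates`, the `max_shift_days` bound, and the use of whole days as the unit are assumptions made here for illustration; the abstract does not specify them. The point of the sketch is that a single shared offset hides the true calendar dates while leaving the intervals between samples unchanged.

```python
import random
from datetime import date, timedelta


def shift_dates(dates, max_shift_days=365, rng=None):
    """Translate every date by the same random offset (hypothetical helper).

    The offset is drawn once and applied to all dates, so differences
    between sampling dates are preserved exactly.
    """
    rng = rng or random.Random()
    # One offset for the whole dataset; range and unit are assumptions.
    offset = timedelta(days=rng.randint(-max_shift_days, max_shift_days))
    return [d + offset for d in dates], offset


# Illustrative sampling dates, not taken from the paper.
true_dates = [date(2020, 1, 1), date(2020, 1, 15), date(2020, 3, 2)]
shared_dates, secret_offset = shift_dates(true_dates, rng=random.Random(1))

# Intervals between consecutive samples are identical before and after shifting.
true_gaps = [b - a for a, b in zip(true_dates, true_dates[1:])]
shared_gaps = [b - a for a, b in zip(shared_dates, shared_dates[1:])]
print(true_gaps == shared_gaps)
```

In this sketch the data holder keeps `secret_offset` private, so estimates expressed in calendar time can be translated back after analysis.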

Phylodynamic analyses estimate epidemiological parameters using pathogen genome sequences and offer insight for public health. The essential data in every analysis are genome sequences, which allow measurement of evolutionary divergence, and their associated sampling times, which allow evolutionary divergence to be modelled as a rate over time. However, the sampling times of pathogen genome sequences are frequently associated with hospitalisation and can be used to identify particular patients. As a result, sampling times are often shared between public health labs and phylodynamics practitioners with reduced date resolution (such as to the month or year) to protect patient identity. Using real-world data and a matching simulation study, we emulate the effects of date rounding on phylodynamic inference to characterise how reduced date resolution introduces error into inference. We find that error arises where sampling dates are rounded to a period longer than the average time it takes for a pathogen to accrue one substitution. We find that this relationship is useful for predicting biased estimation for datasets reflecting short-term sampling. We conclude by discussing how accurate sampling dates can be shared in a way that protects patient identity while preserving accuracy.
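The relationship described above, comparing the date resolution with the average time to accrue one substitution, can be sketched as a simple check. This is a rough illustration, not the authors' implementation: it assumes the average time per substitution is the reciprocal of the substitution rate (per site per year) multiplied by genome length, and the function names, window lengths, and example values are all hypothetical.

```python
DAYS_PER_YEAR = 365.25

# Approximate length of each rounding window in days (assumed values).
ROUNDING_WINDOW_DAYS = {
    "day": 1.0,
    "month": DAYS_PER_YEAR / 12,
    "year": DAYS_PER_YEAR,
}


def days_per_substitution(rate_per_site_per_year, genome_length):
    """Average days for one substitution to arise anywhere in the genome."""
    substitutions_per_year = rate_per_site_per_year * genome_length
    return DAYS_PER_YEAR / substitutions_per_year


def rounding_may_bias(rate_per_site_per_year, genome_length, resolution):
    """True when the rounding window exceeds the average time per substitution."""
    window = ROUNDING_WINDOW_DAYS[resolution]
    return window > days_per_substitution(rate_per_site_per_year, genome_length)


# Illustrative inputs only: a fast-evolving genome and a slow-evolving one.
for label, rate, length in [("fast", 1e-3, 30_000), ("slow", 1e-7, 4_000_000)]:
    for resolution in ("day", "month", "year"):
        flag = rounding_may_bias(rate, length, resolution)
        print(label, resolution, "possible bias" if flag else "likely safe")
```

Under these assumptions, a faster substitution rate or a longer genome shortens the time per substitution, so coarser rounding is flagged sooner. The summary also notes that bias decreases for datasets with longer sampling intervals, which this simple threshold does not capture.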

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

50 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11991728/full.md

---
Source: https://tomesphere.com/paper/PMC11991728