# HIV-phyloTSI: subtype-independent estimation of time since HIV-1 infection for cross-sectional measures of population incidence using deep sequence data

**Authors:** Tanya Golubchik, Lucie Abeler-Dörner, Matthew Hall, Chris Wymant, David Bonsall, George Macintyre-Cockett, Laura Thomson, Jared M. Baeten, Connie L. Celum, Ronald M. Galiwango, Barry Kosloff, Mohammed Limbada, Andrew Mujugira, Nelly R. Mugo, Astrid Gall, François Blanquart, Margreet Bakker, Daniela Bezemer, Swee Hoe Ong, Jan Albert, Norbert Bannert, Jacques Fellay, Barbara Gunsenheimer-Bartmeyer, Huldrych F. Günthard, Pia Kivelä, Roger D. Kouyos, Laurence Meyer, Kholoud Porter, Ard van Sighem, Mark van der Valk, Ben Berkhout, Paul Kellam, Marion Cornelissen, Peter Reiss, Helen Ayles, David N. Burns, Sarah Fidler, Mary Kate Grabowski, Richard Hayes, Joshua T. Herbeck, Joseph Kagaayi, Pontiano Kaleebu, Jairam R. Lingappa, Deogratius Ssemwanga, Susan H. Eshleman, Myron S. Cohen, Oliver Ratmann, Oliver Laeyendecker, Christophe Fraser

PMC · DOI: 10.1186/s12859-025-06189-y · BMC Bioinformatics · 2025-08-14

## TL;DR

This paper introduces HIV-phyloTSI, a method to estimate the time since HIV infection using deep sequencing data, enabling more accurate population-level tracking of the HIV epidemic.

## Contribution

HIV-phyloTSI produces continuous, subtype-independent estimates of time since infection from measures of within-host diversity and divergence derived from deep-sequencing data.

## Key findings

- HIV-phyloTSI estimates TSI up to 9 years with a mean absolute error of less than 12 months.
- The method performs equally well across all major HIV subtypes, based on data from African and European cohorts.
- For infections with a TSI of up to one year, the mean absolute error is less than 5 months.

## Abstract

Estimating the time since HIV infection (TSI) at population level is essential for tracking changes in the global HIV epidemic. Most methods for determining TSI give a binary classification of infections as recent or non-recent within a window of several months, and cannot assess the cumulative impact of an intervention.

We developed a Random Forest Regression model, HIV-phyloTSI, which combines measures of within-host diversity and divergence to generate continuous TSI estimates directly from viral deep-sequencing data, with no need for additional variables. HIV-phyloTSI provides a continuous measure of TSI up to 9 years, with a mean absolute error of less than 12 months overall and less than 5 months for infections with a TSI of up to a year. It performs equally well for all major HIV subtypes based on data from African and European cohorts.
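The error figures quoted above are mean absolute errors (MAE), reported both overall and for the subset of infections with a TSI of up to one year. As a minimal sketch of how such a stratified MAE can be computed, the Python below uses made-up toy values. The function names, the data, and the choice to stratify on the true (rather than the predicted) TSI are assumptions for illustration and are not taken from the paper or its software.

```python
from statistics import mean


def mean_absolute_error(true_tsi, predicted_tsi):
    """Mean absolute difference between known and estimated TSI (same units)."""
    if len(true_tsi) != len(predicted_tsi):
        raise ValueError("true_tsi and predicted_tsi must have the same length")
    if not true_tsi:
        raise ValueError("at least one sample is required")
    return mean(abs(t - p) for t, p in zip(true_tsi, predicted_tsi))


def stratified_mae(true_tsi, predicted_tsi, max_true_tsi):
    """MAE restricted to samples whose true TSI is at most max_true_tsi."""
    pairs = [(t, p) for t, p in zip(true_tsi, predicted_tsi) if t <= max_true_tsi]
    if not pairs:
        raise ValueError("no samples fall inside the requested stratum")
    return mean_absolute_error([t for t, _ in pairs], [p for _, p in pairs])


# Toy values in years, invented for illustration only (not data from the paper).
true_years = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
predicted_years = [0.5, 0.25, 1.5, 1.0, 5.0, 6.5]

overall_mae_months = 12 * mean_absolute_error(true_years, predicted_years)
first_year_mae_months = 12 * stratified_mae(true_years, predicted_years, 1.0)

print(f"overall MAE: {overall_mae_months:.1f} months")
print(f"MAE for TSI <= 1 year: {first_year_mae_months:.1f} months")
```

Reporting the error per stratum matters because a single overall figure can hide the fact that errors tend to differ between recent and long-standing infections.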

We demonstrate how HIV-phyloTSI can be used for incidence estimates on a population level.
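To illustrate what a continuous TSI adds over a binary recent/non-recent label, the sketch below tabulates a cross-sectional sample by the number of whole years between estimated infection and sampling. This is not the incidence estimator used in the paper: a real incidence estimate also needs population denominators and corrections for sampling and survival. The function name, the 9-year cap on the bins, and the toy values are assumptions made for this example.

```python
import math
from collections import Counter


def infections_by_year_before_sampling(tsi_estimates_years, max_years=9):
    """Count samples per whole year elapsed between infection and sampling.

    Index 0 holds infections estimated at under 1 year, index 1 those at
    1 to 2 years, and so on. Estimates at or beyond max_years are placed
    in the last bin.
    """
    counts = Counter()
    for tsi in tsi_estimates_years:
        if tsi < 0:
            raise ValueError("TSI estimates must be non-negative")
        year_bin = min(math.floor(tsi), max_years - 1)
        counts[year_bin] += 1
    return [counts.get(year, 0) for year in range(max_years)]


# Toy TSI estimates in years, invented for illustration only.
estimates = [0.3, 0.8, 1.2, 1.9, 2.5, 4.1, 4.7, 8.9, 9.5]
histogram = infections_by_year_before_sampling(estimates)

for years_ago, count in enumerate(histogram):
    print(f"{years_ago}-{years_ago + 1} years before sampling: {count}")
```

A binary recency assay would only separate the first bin from all the others; the yearly breakdown is what makes it possible to look for changes over a longer period, such as the cumulative effect of an intervention.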

The online version contains supplementary material available at 10.1186/s12859-025-06189-y.

## Full-text entities

- **Diseases:** HIV (MESH:D015658), infections (MESH:D007239)
- **Species:** Human immunodeficiency virus 1 (no rank) [taxon 11676]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12351810/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12351810/full.md

## References

7 references — full list in the complete paper: https://tomesphere.com/paper/PMC12351810/full.md

---
Source: https://tomesphere.com/paper/PMC12351810