# HIVGenoPipe: a nextflow pipeline for the detection of HIV-1 drug resistance using a real-time sample-specific reference sequence

**Authors:** Thoai Dotrang, Brad T. Sherman, Lisheng Dai, Muhammad Ayub Khan, Helene C. Highbarger, Whitney Bruchey, Sylvain Laverdure, Michael W. Baseler, Tomozumi Imamichi, Robin L. Dewar, Weizhong Chang

PMC · DOI: 10.1186/s12859-025-06201-5 · BMC Bioinformatics · 2025-07-07

## TL;DR

HIVGenoPipe is a Nextflow pipeline that detects HIV-1 drug resistance mutations in the gag-pol region of Illumina-sequenced samples, calculating variant frequencies against a sample-specific reference that is generated in real time.

## Contribution

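HIVGenoPipe introduces a sample-specific reference, generated in real time by a hybrid of de novo and reference-based assembly, against which variant frequencies are calculated for more accurate detection of HIV-1 drug resistance mutations.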

## Key findings

- HIVGenoPipe produces more accurate gag-pol consensus sequences than existing pipelines.
- The pipeline's use of a sample-specific reference improves detection of drug resistance mutations.
- HIVGenoPipe was validated by comparing its results with those of HyDRA (which is limited to the pol region) and of Sanger sequencing on the same set of 30 samples.

## Abstract

The emergence of HIV drug resistance is a challenge in controlling the acquired immunodeficiency syndrome (AIDS) pandemic caused by human immunodeficiency virus-1 (HIV-1) infection. Detection of drug resistance variants at minor frequencies can help to formulate successful antiretroviral therapy (ART) regimens for people living with HIV (PLWH) and reduce the emergence of drug resistance. Therefore, a pipeline which can accurately produce consensus nucleotide sequences and identify drug resistance mutations (DRMs) at defined frequency thresholds will be helpful in the treatment of PLWH, analysis of virus evolution, and the control of the pandemic.

We have developed a pipeline, HIVGenoPipe, to determine HIV drug resistance variants within the gag-pol region above user-defined frequencies for HIV-1 samples sequenced using Illumina technology. The pipeline has been validated by comparing its results with the results generated by a widely used pipeline, HyDRA, which is limited to the pol region, and with the results generated by Sanger sequencing technology using the same set of 30 samples. The variant frequencies used to generate ambiguous consensus sequences in HIVGenoPipe are more accurate than those of other pipelines because they are calculated against a sample-specific reference, which is generated in real time with a novel hybrid strategy of de novo and reference-based assembly. This leads to more accurate drug resistance calls for use by clinicians. In addition, because Nextflow is used as the pipeline platform, HIVGenoPipe is inherently portable, scalable and reproducible, and its components can be updated or replaced independently if required.
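To make the idea of an ambiguous consensus at a user-defined frequency threshold concrete, here is a minimal Python sketch. It is not HIVGenoPipe's implementation: the function names, the per-position base-count input, and the threshold values are assumptions chosen for illustration, and the sketch ignores indels, coverage, and base-quality filtering. It reports every base whose frequency at a position meets the threshold and encodes the resulting set with the standard IUPAC nucleotide codes.

```python
# Illustrative sketch only; names and inputs are hypothetical, not HIVGenoPipe's API.

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def ambiguous_base(counts, threshold):
    """Return the IUPAC code for all bases at or above `threshold` frequency.

    `counts` maps a base (A, C, G, T) to its read count at one position.
    Positions with no reads, or with no base reaching the threshold, yield "N".
    """
    total = sum(counts.values())
    if total == 0:
        return "N"
    kept = frozenset(b for b, n in counts.items() if n / total >= threshold)
    return IUPAC.get(kept, "N")


def consensus(pileup, threshold):
    """Build an ambiguous consensus string from a list of per-position counts."""
    return "".join(ambiguous_base(counts, threshold) for counts in pileup)


# Three hypothetical positions with different minor-variant frequencies.
pileup = [
    {"A": 90, "G": 10},  # minor G at 10%
    {"A": 70, "G": 30},  # minor G at 30%
    {"C": 50, "T": 50},  # even mixture
]

print(consensus(pileup, 0.20))  # ARY
print(consensus(pileup, 0.05))  # RRY
```

Lowering the threshold from 20% to 5% turns the first position from `A` into `R`, which shows why the frequencies themselves need to be accurate: a minor variant sitting near the threshold is either reported or dropped depending on how its frequency was calculated.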

We developed HIVGenoPipe for the detection of HIV-1 drug resistance. It constructs more accurate gag-pol consensus sequences, leading to improved detection of DRMs. HIVGenoPipe is open source and freely available under the MIT license at https://github.com/LHRI-Bioinformatics/HIVGenoPipe. The current release (v1.0.1) is archived and available at https://doi.org/10.5281/zenodo.15528502.

The online version contains supplementary material available at 10.1186/s12859-025-06201-5.

## Linked entities

- **Diseases:** AIDS (MONDO:0012268)
- **Species:** Human immunodeficiency virus 1 (taxon 11676)

## Full-text entities

- **Genes:** gag (Pr55(Gag)) [NCBI Gene 155030]
- **Diseases:** AIDS (MESH:D000163), HIV (MESH:D015658), PLWH (MESH:C000719191)
- **Species:** Human immunodeficiency virus 1 (no rank) [taxon 11676]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12235847/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12235847/full.md

## References

3 references — full list in the complete paper: https://tomesphere.com/paper/PMC12235847/full.md

---
Source: https://tomesphere.com/paper/PMC12235847