# A Comparative Evaluation of Four Bioinformatic Tools for Identifying HIV-1 pol Drug Resistance Mutations Using Illumina MiSeq Data

**Authors:** Ogestelli Fabia Lee, Chun Kiat Lee

PMC · DOI: 10.3390/biology15050438 · 2026-03-07

## TL;DR

This study compares four tools for identifying HIV drug resistance mutations in next-generation sequencing data and finds that a custom de novo assembly method is the most accurate.

## Contribution

The study demonstrates that a custom de novo assembly workflow outperforms existing tools in detecting low-abundance HIV-1 drug resistance mutations.

## Key findings

- Most tools struggle with low-abundance mutations or complex genetic structures.
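- The custom de novo assembly workflow (iLunaR) achieved perfect agreement with the consensus standard (Cohen's kappa = 1.000).
- Quasitools had the lowest agreement (Cohen's kappa = 0.901) because it consistently reported mutations at lower abundance levels and because aligner-induced reference bias misclassified a deletion as a point mutation.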

## Abstract

Successful treatment of the human immunodeficiency virus depends on identifying genetic changes that make antiretroviral therapy medications ineffective. For decades, a traditional laboratory method called Sanger sequencing was the gold standard, but it often fails to detect rare, resistant versions of the virus that exist at low levels. Laboratories are now transitioning to next-generation sequencing, which is much more sensitive but relies on complex analysis workflows that can produce inconsistent results. This study addressed the variability in bioinformatic tools used to identify these changes. We compared four bioinformatic tools to determine which best identifies these low-abundance mutations using eighty-five next-generation sequencing datasets. We found that while most tools work well for common mutations, they struggle with low-abundance mutations or complex genetic structures. A custom approach that reconstructs the viral genetic code from scratch, known as de novo assembly, was the most accurate. Other tools missed critical resistance markers or misclassified genetic changes. These findings show that the choice of bioinformatic tool is a vital part of medical care. By using more precise tools, laboratories can provide more reliable reports, helping patients receive the most effective treatments.

The transition from Sanger to next-generation sequencing (NGS) for HIV-1 drug resistance testing offers enhanced sensitivity but also introduces bioinformatic variability. This study evaluated four strategies: the commercial Exatype platform, the academic Stanford HIVdb-NGS, the open-source Quasitools (HyDRA) suite, and a custom de novo assembly workflow, iLunaR. Using 85 clinical HIV-1 pol MiSeq sequencing datasets, concordance was assessed at a 2% mutation detection threshold. A majority consensus standard defined true presence if a mutation was detected by at least three pipelines and supported by Sanger sequencing. While the datasets were successfully processed by all pipelines, discordances emerged in detecting low-abundance mutations and a specific case of structural mutation. iLunaR achieved perfect agreement (Cohen’s kappa = 1.000; 95% CI: 1.000–1.000). Quasitools demonstrated the lowest agreement (Cohen’s kappa = 0.901; 95% CI: 0.807–0.995) due to consistent reporting of mutations at lower abundance levels and aligner-induced reference bias misclassifying a deletion as a point mutation. Exatype (Cohen’s kappa = 0.951; 95% CI: 0.884–1.000) and Stanford (Cohen’s kappa = 0.926; 95% CI: 0.846–1.000) exhibited specific failures, including an omitted integrase mutation and codon translation errors, respectively. These findings confirm that bioinformatic algorithm choice remains a critical clinical variable despite NGS advancements in HIV-1 drug resistance testing.
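The consensus rule and agreement statistic described above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `>=` comparison at the 2% threshold, and the handling of the degenerate case are assumptions made for illustration, and the abstract does not specify how the confidence intervals were computed, so they are omitted here.

```python
from typing import Dict, List

THRESHOLD = 0.02      # 2% mutation detection threshold stated in the abstract
MIN_PIPELINES = 3     # "detected by at least three pipelines"


def is_called(frequency: float, threshold: float = THRESHOLD) -> bool:
    """Treat a mutation as detected when its frequency reaches the threshold.

    Assumption: the comparison is inclusive (>=); the abstract does not say.
    """
    return frequency >= threshold


def consensus_present(pipeline_calls: Dict[str, bool], sanger_supported: bool) -> bool:
    """Majority consensus: at least three pipelines AND Sanger support."""
    return sum(pipeline_calls.values()) >= MIN_PIPELINES and sanger_supported


def cohen_kappa(rater_a: List[bool], rater_b: List[bool]) -> float:
    """Cohen's kappa for two binary raters: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: fraction of items where both raters give the same call.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal positive rate.
    p_a = sum(rater_a) / n
    p_b = sum(rater_b) / n
    p_e = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_e == 1.0:
        # Degenerate case: both raters give one identical constant call.
        # Kappa is formally undefined; returning 1.0 is a choice made here.
        return 1.0
    return (p_o - p_e) / (1 - p_e)
```

To score one pipeline, build one boolean per mutation site from `is_called`, build the matching consensus list with `consensus_present`, and pass both lists to `cohen_kappa`. A pipeline whose calls match the consensus at every site scores 1.0, which is the pattern the abstract reports for iLunaR.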

## Full-text entities

- **Species:** Human immunodeficiency virus 1 (no rank) [taxon 11676]

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12984109/full.md

---
Source: https://tomesphere.com/paper/PMC12984109