# Hepatitis C Virus Saint Petersburg Variant Detection With Machine Learning Methods

**Authors:** Nurhan Arslan, Bernhard Reuter, Joachim Buech, Thomas Lengauer, Martin Obermeier, Rolf Kaiser, Nico Pfeifer

PMC · DOI: 10.1002/jmv.70169 · Journal of Medical Virology · 2025-02-17

## TL;DR

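This paper introduces machine learning models that detect the 2k/1b recombinant variant of Hepatitis C virus (HCV). Identifying this variant correctly matters for treatment choice when pan-genotypic direct-acting antivirals (DAAs) are unavailable and drugs must be selected by genotype.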

## Contribution

The paper presents machine learning models, trained on 1b and 2k/1b nonstructural protein sequences, that detect the 2k/1b HCV variant, and integrates them into the open-access geno2pheno[HCV] tool.

## Key findings

- Machine learning models were developed to predict the 2k/1b variant from 1b and 2k/1b sequence data of nonstructural proteins.
- The models were integrated into the geno2pheno[HCV] tool, which is open-access for researchers and clinicians.
- Accurate detection of the 2k/1b variant is critical for treatment planning wherever pan-genotypic direct-acting antivirals (DAAs) are not available.
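
To make the classification task in the findings above concrete, here is a minimal sketch of a two-class sequence classifier (1b versus 2k/1b). It is not the authors' method: this summary does not state which features or algorithms the paper uses, so the k-mer counting, the nearest-centroid rule, the function names, and the training strings are all assumptions chosen for illustration. The sequences are short synthetic placeholders, not real HCV data.

```python
from collections import Counter
from math import sqrt

# Synthetic placeholder sequences (NOT real HCV nonstructural protein data).
TOY_TRAINING_SET = [
    ("AAKLTGAAKLTGAAKLSG", "1b"),
    ("AAKLTGAAKMTGAAKLTG", "1b"),
    ("AAKLTGVVRLSGVVRLSG", "2k/1b"),
    ("AAKLTGVVRLSGVVRMSG", "2k/1b"),
]


def kmer_profile(seq: str, k: int = 3) -> Counter:
    """Count overlapping k-mers in an amino acid sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse k-mer count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fit_centroids(labelled, k: int = 3) -> dict:
    """Sum the k-mer profiles of all training sequences per class label."""
    centroids = {}
    for seq, label in labelled:
        centroids.setdefault(label, Counter()).update(kmer_profile(seq, k))
    return centroids


def predict(seq: str, centroids: dict, k: int = 3) -> str:
    """Return the label whose centroid is most similar to the query."""
    profile = kmer_profile(seq, k)
    return max(centroids, key=lambda label: cosine(profile, centroids[label]))


centroids = fit_centroids(TOY_TRAINING_SET)
print(predict("AAKLTGVVRLSGVVRLTG", centroids))  # prints "2k/1b"
print(predict("AAKLTGAAKLTGAAKLTG", centroids))  # prints "1b"
```

A nearest-centroid rule is used here only because it fits in a few lines of standard-library Python. A real pipeline would need aligned reference sequences, held-out validation, and a model choice justified against the data, all of which are described in the full paper rather than in this summary.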

## Abstract

Hepatitis C virus (HCV) infection is a significant global health concern, affecting millions worldwide. Although direct‐acting antivirals (DAAs) achieve a success rate of over 90%, treatment failures still occur, particularly when pan‐genotypic DAAs are unavailable and drugs must be chosen according to the HCV genotype present. Genotyping tests can be misleading, especially in cases involving the 2k/1b recombinant variant. The 2k/1b variant was first discovered in Saint Petersburg in 2002 and is most commonly observed in Eastern European countries, including Russia, Georgia, and Ukraine. Due to migration, the 2k/1b variant has spread to Western Europe and other regions, potentially increasing HCV transmission and changing the virus's epidemiological landscape. This situation highlights the importance of molecular epidemiology in monitoring the spread of the 2k/1b variant. Accurate detection and characterization of the 2k/1b variant are crucial for effective treatment if no pan‐genotypic DAAs are available. To address this need, machine learning models were developed to predict the 2k/1b variant based on 1b and 2k/1b sequence data from nonstructural proteins. The models were integrated into the geno2pheno[HCV] tool, providing physicians and researchers with an open‐access resource for determining HCV genotypes, including the 2k/1b variant.

## Linked entities

- **Diseases:** Hepatitis C virus infection (MONDO:0005231)

## Full-text entities

- **Diseases:** Hepatitis C virus infection (MESH:D006526)
- **Species:** Hepatitis C Virus [taxon 11103]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11831414/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11831414/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/PMC11831414/full.md

---
Source: https://tomesphere.com/paper/PMC11831414