# MolEpidPred: a novel computational tool for the molecular epidemiology of foot-and-mouth disease virus using VP1 nucleotide sequence data

**Authors:** Samarendra Das, Utkal Nayak, Soumen Pal, Saravanan Subramaniam

PMC · DOI: 10.1093/bfgp/elaf001 · Briefings in Functional Genomics · 2025-03-05

## TL;DR

MolEpidPred is a fast and accurate computational tool for predicting the molecular epidemiology of foot-and-mouth disease virus using VP1 nucleotide sequences.

## Contribution

The study introduces MolEpidPred, a novel computational tool that enables rapid and accurate prediction of FMD virus serotype, topotype, and lineage.

## Key findings

- For serotype prediction, support vector machine, random forest, XGBoost, and AdaBoost outperformed the other six of the ten learning algorithms evaluated.
- MolEpidPred achieved ≥98% accuracy for serotype prediction on independent datasets and 100% accuracy on wet-lab data.
- The tool returns results in seconds, and on wet-lab data its predictions closely matched phylogenetic analysis, which served as the benchmark.
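
The accuracy figures above are multi-class classification metrics. As a minimal sketch of how overall accuracy, per-class sensitivity, and per-class precision can be computed from true and predicted labels, consider the following. The function name `per_class_metrics` and the toy labels are illustrative assumptions and are not taken from the paper or the MolEpidPred tool.

```python
def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and per-class sensitivity/precision.

    Hypothetical helper for illustration; not part of MolEpidPred.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        report[label] = {
            # sensitivity (recall): share of true members of the class recovered
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            # precision: share of predictions for the class that are correct
            "precision": tp / (tp + fp) if tp + fp else 0.0,
        }
    return accuracy, report


# Toy labels (made up for illustration, not data from the study).
y_true = ["O", "O", "O", "A", "A", "Asia1"]
y_pred = ["O", "O", "A", "A", "A", "Asia1"]
accuracy, report = per_class_metrics(y_true, y_pred)
```

In this toy case one serotype O sample is misassigned to A, which lowers the sensitivity for O and the precision for A while leaving Asia1 unaffected.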

## Abstract

Molecular epidemiology of foot-and-mouth disease (FMD), which primarily involves identifying the serotype, topotype, and lineage of the virus, is crucial for implementing control strategies such as vaccination and containment. Existing approaches, including serotyping, are biological in nature; they are time-consuming and risky because they require handling live virus. Novel computational tools are therefore needed for large-scale molecular epidemiology of the FMD virus. This study reports a comprehensive computational tool for FMD molecular epidemiology. Ten learning algorithms were first evaluated for serotype prediction using sequence-based features, on cross-validated data and on ten independent secondary datasets, through accuracy, sensitivity, and 14 other metrics. Next, the best-performing algorithms, those with the highest serotype predictive accuracies, were evaluated for topotype and lineage prediction using cross-validation and implemented in the computational tool. The performance of the developed approach was then assessed on five previously unseen independent secondary datasets and on primary experimental data. The cross-validated and independent evaluations for serotype prediction revealed that the support vector machine, random forest, XGBoost, and AdaBoost algorithms outperformed the others. These four algorithms were then evaluated for topotype and lineage prediction, where they achieved accuracy ≥96% and precision ≥95% on cross-validated data. They are implemented in the web server (https://nifmd-bbf.icar.gov.in/MolEpidPred), which allows rapid molecular epidemiology of the FMD virus. Independent validation of MolEpidPred yielded accuracies of ≥98%, ≥90%, and ≥80% for serotype, topotype, and lineage prediction, respectively. On wet-lab data, MolEpidPred returned results in a few seconds and achieved accuracies of 100%, 100%, and 96% for serotype, topotype, and lineage prediction, respectively, when benchmarked against phylogenetic analysis. MolEpidPred provides an innovative platform for large-scale molecular epidemiology of the FMD virus, which is crucial for tracking FMD virus infection and implementing control programs.
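
The abstract refers to "sequence-based features" without specifying them in this summary. One common way to turn a nucleotide sequence into a fixed-length numeric vector is normalized k-mer frequencies; the sketch below assumes that encoding purely for illustration and may differ from what the authors used. The function name `kmer_frequencies`, the choice of `k`, and the toy fragment are all hypothetical.

```python
from itertools import product


def kmer_frequencies(sequence, k=3):
    """Encode a nucleotide sequence as normalized k-mer frequencies.

    Illustrative assumption only: the summary does not say which
    sequence-based features MolEpidPred actually uses.
    """
    seq = sequence.upper().replace("U", "T")  # treat RNA input as DNA letters
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    # Fixed ordering (AA.., AC.., ...) gives every sequence the same vector layout.
    return [counts[kmer] / total for kmer in kmers]


# Made-up fragment for illustration; not a real VP1 sequence.
toy_fragment = "ACGTACGTNACG"
features = kmer_frequencies(toy_fragment, k=2)
```

A vector like `features` has length 4^k regardless of sequence length, which is what lets classifiers such as support vector machines or random forests consume sequences of varying size.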

## Linked entities

- **Proteins:** VP1 (foot-and-mouth disease virus capsid protein VP1)
- **Diseases:** Foot-and-mouth disease (MONDO:0005765)

## Full-text entities

- **Diseases:** FMD (MESH:D005536)
- **Species:** Foot-and-mouth disease virus (no rank) [taxon 12110]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11881699/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11881699/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/PMC11881699/full.md

---
Source: https://tomesphere.com/paper/PMC11881699