# Classification of virulence factors based on dual-channel neural networks with pre-trained language models

**Authors:** Guanghui Li, Peiyang Song, Jiawei Luo, Cheng Liang, Aiqing Fang

PMC · DOI: 10.1371/journal.pone.0340194 · PLOS One · 2026-01-05

## TL;DR

This paper introduces PLM-GNN, a dual-channel model that classifies the seven most numerous virulence factor (VF) types from both structural and sequence features, reaching 86.47% accuracy on an independent test set.

## Contribution

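PLM-GNN is a dual-channel model for VF classification. Its structure channel uses a geometric graph neural network to capture three-dimensional structural features, and its sequence channel uses a pre-trained language model with CNN and Transformer layers to extract local and global sequence features.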
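To make the dual-channel idea concrete, here is a minimal sketch of how two per-protein embeddings could be combined into a seven-way prediction. This summary does not say how PLM-GNN fuses its channels, so the concatenation step, the single linear layer, the embedding sizes, the random weights, and the name `fuse_and_classify` are all assumptions for illustration, not the authors' implementation.

```python
import math
import random

NUM_CLASSES = 7  # the seven VF types targeted in the paper


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(struct_emb, seq_emb, weights, bias):
    """Hypothetical fusion: concatenate both channel embeddings,
    apply one linear layer, and return class probabilities."""
    fused = list(struct_emb) + list(seq_emb)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Placeholder inputs: in a real model these would come from the
# structure channel and the sequence channel respectively.
rng = random.Random(0)
struct_emb = [rng.uniform(-1, 1) for _ in range(4)]  # assumed size 4
seq_emb = [rng.uniform(-1, 1) for _ in range(6)]     # assumed size 6
weights = [[rng.uniform(-0.5, 0.5) for _ in range(10)]
           for _ in range(NUM_CLASSES)]
bias = [0.0] * NUM_CLASSES

probs = fuse_and_classify(struct_emb, seq_emb, weights, bias)
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
```

Concatenation is only one common way to combine two feature channels; the full paper should be consulted for the fusion strategy actually used.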

## Key findings

- On the independent test set, PLM-GNN achieved 86.47% accuracy and an F1 score of 86.20% across the seven major virulence factor types.
- On the same test set, the model reached an AUC of 97.20%.

## Abstract

Virulence factors (VFs) are crucial molecules that enable pathogens to cause infection and disease in a host. They allow pathogens to evade the host’s immune defenses and facilitate the progression of infection through various mechanisms. With the increasing prevalence of antibiotic-resistant strains and the emergence of new and re-emerging infectious agents, the classification of VFs has become more critical. This study presents PLM-GNN, an innovative dual-channel model designed for precise classification of VFs, focusing on the seven most numerous types. It integrates a structure channel, which employs a geometric graph neural network to capture the three-dimensional structure features of VFs, and a sequence channel that utilizes a pre-trained language model with Convolutional Neural Network (CNN) and Transformer architectures to extract local and global features from VF sequences, respectively. On the independent test set, the method achieved an accuracy of 86.47%, an F1 score of 86.20% and an Area Under the Receiver Operating Characteristic Curve (AUC) of 97.20%, validating its effectiveness. In conclusion, PLM-GNN can precisely classify the seven major VFs, offering a novel approach for studying their functions.
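For readers unfamiliar with the reported metrics, the sketch below shows how accuracy and a multi-class F1 score can be computed from true and predicted labels. The abstract does not state which F1 averaging was used, so macro averaging here is an assumption, and the labels are toy data rather than results from the paper. AUC is omitted because it requires per-class probability scores.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, assumed)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for three classes (illustrative only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

In this toy example four of the six predictions are correct, and the per-class F1 scores are 0.5, 0.8, and 2/3, which the macro average weights equally regardless of class size.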

## Full-text entities

- **Diseases:** infection (MESH:D007239)
- **Chemicals:** PLM-GNN (-)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12768247/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12768247/full.md

## References

66 references — full list in the complete paper: https://tomesphere.com/paper/PMC12768247/full.md

---
Source: https://tomesphere.com/paper/PMC12768247