# TXSelect: A multi-task learning model to identify secretory effectors

**Authors:** Jing Li, Qing Liu, Quan Zou, Chao Zhan

PMC · DOI: 10.1371/journal.pcbi.1013677 · PLOS Computational Biology · 2025-11-06

## TL;DR

TXSelect is a multi-task deep learning model that classifies type I, II, III, IV and VI bacterial secretory effectors (test F1 = 0.8645), supporting research into infection mechanisms and treatment development.

## Contribution

TXSelect introduces a multi-task learning framework, built on a shared backbone network with task-specific heads, that combines ESM protein embeddings with classical sequence descriptors to classify several effector types at once.

## Key findings

- TXSelect reaches a validation F1 of 0.867 and a test F1 of 0.8645 when classifying secretory effectors.
- The model uses ESM N-terminal mean + DR + SC-PseAAC as the optimal feature combination.
- TXSelect provides interpretable insights into molecular features distinguishing effector types.
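
The feature combination named above can be pictured as pooling per-residue embeddings over the N-terminal region and concatenating the result with the two descriptor vectors. The following is only a minimal sketch under stated assumptions: the function names, the 50-residue window, and the toy dimensions are hypothetical and are not taken from the paper, and random numbers stand in for real ESM output.

```python
import numpy as np


def n_terminal_mean(residue_embeddings: np.ndarray, n_residues: int = 50) -> np.ndarray:
    """Average per-residue embeddings over the first `n_residues` positions.

    `n_residues=50` is an illustrative assumption, not a value from the paper.
    """
    if residue_embeddings.ndim != 2 or residue_embeddings.shape[0] == 0:
        raise ValueError("expected a non-empty (length, dim) array")
    # Sequences shorter than the window simply use all available residues.
    window = residue_embeddings[:n_residues]
    return window.mean(axis=0)


def fuse_features(residue_embeddings, dr, sc_pseaac, n_residues: int = 50) -> np.ndarray:
    """Concatenate the pooled embedding with the DR and SC-PseAAC vectors."""
    parts = [
        n_terminal_mean(np.asarray(residue_embeddings, dtype=float), n_residues),
        np.asarray(dr, dtype=float),
        np.asarray(sc_pseaac, dtype=float),
    ]
    return np.concatenate(parts)


# Toy inputs: random numbers stand in for real embeddings and descriptors.
rng = np.random.default_rng(0)
emb = rng.normal(size=(120, 8))   # 120 residues, 8-dimensional embedding
dr = np.zeros(5)                  # placeholder DR vector
sc = np.ones(3)                   # placeholder SC-PseAAC vector
fused = fuse_features(emb, dr, sc, n_residues=50)
print(fused.shape)
```

Simple concatenation is only one way to combine the blocks; the paper may scale or transform each block before fusing them.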

## Abstract

Secretory effectors from pathogenic microorganisms significantly influence pathogen survival and pathogenicity by manipulating host signalling, immune responses, and metabolic processes. However, because of sequence and structural heterogeneity among bacterial effectors, accurately classifying multiple types simultaneously remains challenging. Therefore, we developed TXSelect, a multi-task learning framework that simultaneously classifies TXSE (types I, II, III, IV and VI secretory effectors) using a shared backbone network with task-specific heads. TXSelect integrates the protein embedding features of evolutionary scale modelling (ESM), particularly the N-terminal mean, with classical descriptors to effectively capture complementary information. These descriptors include distance-based residue (DR) and split amino acid composition general (SC-PseAAC-General). Rigorous evaluation identified ESM N-terminal mean + DR + SC-PseAAC as the optimal feature combination, achieving high accuracy (validation F1 = 0.867, test F1 = 0.8645) and robust generalization. Comprehensive assessments and visualization with Uniform Manifold Approximation and Projection further validated the discriminative capability and interpretability of the model. TXSelect provides an efficient computational tool for accurately classifying bacterial effectors, supporting deeper biological understanding and potential therapeutic development.
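
To make the "shared backbone with task-specific heads" idea concrete, here is a minimal NumPy sketch of a forward pass and a summed per-task loss. It is an illustration under assumptions, not the authors' architecture: the single hidden layer, the one-sigmoid-head-per-effector-type design, the task labels (`T1SE` and so on), the layer sizes, and the unweighted loss sum are all hypothetical choices.

```python
import numpy as np

# Illustrative task names for effector types I, II, III, IV and VI.
TASKS = ["T1SE", "T2SE", "T3SE", "T4SE", "T6SE"]


def init_params(in_dim: int, hidden_dim: int, rng: np.random.Generator) -> dict:
    """Create one shared layer plus one small binary head per task."""
    return {
        "W_shared": rng.normal(scale=0.1, size=(in_dim, hidden_dim)),
        "b_shared": np.zeros(hidden_dim),
        "heads": {t: (rng.normal(scale=0.1, size=hidden_dim), 0.0) for t in TASKS},
    }


def forward(params: dict, x: np.ndarray) -> dict:
    """Return a per-task probability for each row of `x`."""
    # Shared backbone: every task sees the same hidden representation.
    h = np.maximum(0.0, x @ params["W_shared"] + params["b_shared"])  # ReLU
    probs = {}
    for task, (w, b) in params["heads"].items():
        logits = h @ w + b                            # task-specific head
        probs[task] = 1.0 / (1.0 + np.exp(-logits))   # sigmoid
    return probs


def multitask_loss(probs: dict, labels: dict) -> float:
    """Sum of per-task binary cross-entropies (equal weights assumed)."""
    eps = 1e-12
    total = 0.0
    for task in TASKS:
        p = np.clip(probs[task], eps, 1.0 - eps)
        y = labels[task]
        total += -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return float(total)


rng = np.random.default_rng(0)
params = init_params(in_dim=16, hidden_dim=8, rng=rng)
x = rng.normal(size=(4, 16))  # four toy fused feature vectors
labels = {t: rng.integers(0, 2, size=4).astype(float) for t in TASKS}
probs = forward(params, x)
loss = multitask_loss(probs, labels)
print(sorted(probs), loss)
```

Because all heads read the same hidden layer, gradients from every task would update the shared weights during training, which is the usual motivation for multi-task learning.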

Secretory effectors are specialized proteins produced by pathogenic bacteria that allow them to infect host organisms by disrupting normal cellular functions. Accurately identifying and classifying these effectors is crucial for understanding infection mechanisms and developing new treatments, but this task is complicated by the high diversity in their sequences and structures. Here, we present TXSelect, a new artificial intelligence model that uses multi-task deep learning to simultaneously recognize multiple types of bacterial secretory effectors. Our model combines advanced protein sequence embeddings from large-scale evolutionary models with classical biochemical descriptors, enabling it to capture more information than either method alone. We rigorously evaluated TXSelect using multiple datasets and strict experimental protocols, demonstrating that it achieves high accuracy and robust performance even under challenging scenarios where the similarities between training and test samples are minimized. Additionally, our analyses provide interpretable insights into which molecular features are most important for distinguishing different effector types.
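
One of the classical descriptors combined with the embeddings is the distance-based residue (DR) descriptor. A common formulation counts single residues and ordered residue pairs separated by up to a maximum distance; the sketch below follows that formulation with raw counts. The function name, the `max_distance=2` default, and the lack of normalization are assumptions, since the parameters used in the paper are not stated in this summary.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dr_descriptor(sequence: str, max_distance: int = 2) -> list:
    """Count residues (distance 0) and ordered residue pairs at distances 1..max_distance.

    The output length is 20 + 400 * max_distance. Non-standard residues are skipped.
    """
    seq = sequence.upper()
    single_index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    pair_index = {pair: i for i, pair in enumerate(product(AMINO_ACIDS, repeat=2))}
    n_pairs = len(pair_index)
    vec = [0] * (len(AMINO_ACIDS) + n_pairs * max_distance)

    for i, aa in enumerate(seq):
        if aa not in single_index:
            continue
        vec[single_index[aa]] += 1  # distance-0 block: residue counts
        for d in range(1, max_distance + 1):
            j = i + d
            if j < len(seq) and seq[j] in single_index:
                # Each distance has its own block of 400 pair counts.
                offset = len(AMINO_ACIDS) + (d - 1) * n_pairs
                vec[offset + pair_index[(aa, seq[j])]] += 1
    return vec


vec = dr_descriptor("MKKA", max_distance=2)
print(len(vec), sum(vec[:20]), sum(vec[20:420]), sum(vec[420:]))
```

For the toy sequence `MKKA`, the distance-1 block holds the pairs MK, KK and KA, and the distance-2 block holds MK and KA.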

## Full-text entities

- **Chemicals:** amino acid (MESH:D000596)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12591437/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12591437/full.md

## References

55 references — full list in the complete paper: https://tomesphere.com/paper/PMC12591437/full.md

---
Source: https://tomesphere.com/paper/PMC12591437