# AI-Driven Microbial Diagnostics: Predicting Disease Signatures Through Microbial Pattern Recognition

**Authors:** Saleha Y. M. Alakilli, Mohamed Nabil Ibrahim, Awadh Alanazi, Eman Fawzy El Azab, Khaled Alzhrani, Osama R. Shahin, Bi Bi Zainab Mazhari, Mohamed Atif A. Said Ahmed

PMC · DOI: 10.3390/diagnostics16050688 · Diagnostics · 2026-02-26

## TL;DR

This paper introduces DysbioFormer, an AI model that improves disease prediction by analyzing gut microbiome patterns more effectively than previous methods.

## Contribution

The novel contribution is DysbioFormer, a multiset transformer framework that captures complex microbial interactions and improves disease diagnostics.

## Key findings

- DysbioFormer achieved 97% accuracy, 0.97 AUC, and 96% F1-score in predicting diseases from gut microbiome data in the MicrobiomeHD collection.
- The model consistently outperformed classical machine learning models under identical evaluation protocols.
- Attention-derived signatures provided interpretable links between microbial taxa and disease predictions.

## Abstract

Background/Objectives: Predicting disease from gut microbiome patterns remains difficult because of the compositional limitations of the data, batch heterogeneity, and limited modeling of inter-taxon interactions. This study introduces DysbioFormer, a Dysbiosis-Aware Multiset Transformer Framework that predicts disease states by recognizing microbial patterns. Methods: Existing methods rely mainly on flat abundance representations or fixed-order models, which limits their ability to describe intricate community interactions and evolutionary structure. Results: DysbioFormer addresses these shortcomings by modeling each microbiome sample as a permutation-invariant multiset of taxonomic tokens that carry compositional, phylogenetic, and harmonized cohort data. Stacked Set Attention Blocks learn relational dependencies between taxa, while Pooling-by-Multihead-Attention aggregates them into global disease-level embeddings without relying on sequence assumptions. The model was evaluated on MicrobiomeHD, which comprises a wide variety of human gut microbiome samples spanning multiple disease conditions and healthy controls. Experimental results demonstrate strong diagnostic performance, with an accuracy of 97%, an AUC of 0.97, and an F1-score of 96%, consistently outperforming classical machine learning models under identical evaluation protocols. Attention-derived signatures also provide interpretable connections between predictions and disease-linked microbial taxa, enhancing biological plausibility. Conclusions: The proposed architecture enables scalable, cohort-agnostic microbial diagnostics and provides a principled route to transforming complex microbiome information into reliable clinical information. DysbioFormer provides a general foundation for future microbiome-based disease screening and precision health applications. Its design can be extended towards multi-omics integration, longitudinal studies, and decision-support infrastructure, supporting microbiome-informed translational medicine in a variety of clinical research settings.
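The set-based design described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it is a minimal, pure-Python toy that assumes single-head attention with no learned projections, feed-forward layers, or normalization, and it uses made-up random token vectors in place of real taxon embeddings. The names `set_attention_block`, `pool_by_attention`, `embed_sample`, `taxa_tokens`, and `seed_vector` are all hypothetical. It only shows why self-attention over a set followed by attention pooling gives the same sample embedding regardless of the order in which taxa are listed.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(queries, keys, values):
    """Scaled dot-product attention: each query gets a weighted mean of the values."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def set_attention_block(tokens):
    """Simplified set attention block: self-attention over the set plus a residual."""
    attended = attend(tokens, tokens, tokens)
    return [[t + a for t, a in zip(tok, att)] for tok, att in zip(tokens, attended)]


def pool_by_attention(tokens, seed):
    """Simplified attention pooling: one seed vector queries the whole set."""
    return attend([seed], tokens, tokens)[0]


def embed_sample(tokens, seed, num_blocks=2):
    """Stack set attention blocks, then pool to a single sample-level embedding."""
    for _ in range(num_blocks):
        tokens = set_attention_block(tokens)
    return pool_by_attention(tokens, seed)


# Hypothetical sample: 5 taxa, each a 4-dimensional token (random placeholder values).
rng = random.Random(0)
taxa_tokens = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
seed_vector = [rng.uniform(-1, 1) for _ in range(4)]  # stands in for a learned seed

embedding = embed_sample(taxa_tokens, seed_vector)

# Listing the same taxa in a different order should not change the embedding,
# apart from floating-point rounding.
shuffled_tokens = taxa_tokens[:]
rng.shuffle(shuffled_tokens)
embedding_shuffled = embed_sample(shuffled_tokens, seed_vector)

max_diff = max(abs(a - b) for a, b in zip(embedding, embedding_shuffled))
print(f"max difference after shuffling taxa: {max_diff:.2e}")
```

Each set attention block is permutation-equivariant (reordering the input tokens reorders the outputs in the same way), and the pooling step sums over all tokens, so the final embedding does not depend on token order. In a full model, a classifier head would presumably map this embedding to a disease label, and the pooling weights are one plausible source of the attention-derived signatures the abstract mentions.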

## Full-text entities

- **Diseases:** Dysbiosis (MESH:D064806)
- **Species:** Homo sapiens (human, species) [taxon 9606], gut metagenome (species) [taxon 749906]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12984828/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12984828/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/PMC12984828/full.md

---
Source: https://tomesphere.com/paper/PMC12984828