# Genomic Epidemiology and Machine Learning–Based Drug Discovery for Antimicrobial Resistant Diarrheagenic Escherichia coli

**Authors:** Ayesha Masood, Fatima Noor, Abdu Rehman, Mohsin Gulzar Barq, Shazia Iqbal, Muhammad Qasim Ali, Shahzad Ahmad, Syed Zeeshan Haider Naqvi

PMC · DOI: 10.1002/mbo3.70236 · MicrobiologyOpen · 2026-02-22

## TL;DR

This study combines genomic analysis and machine learning-based virtual screening to identify candidate compounds against antibiotic-resistant diarrheagenic E. coli, a leading cause of diarrhea in children.

## Contribution

Integration of genomic epidemiology and machine learning to identify novel drug candidates against multidrug-resistant diarrheagenic E. coli.

## Key findings

- Enteropathogenic E. coli was the most prevalent DEC pathotype (35%), followed by enterotoxigenic (25%), diffusely adherent (20%), enterohemorrhagic (15%), and enteroinvasive E. coli (10%).
- Resistance was high to ampicillin (87.6%), trimethoprim-sulfamethoxazole (75.5%), and erythromycin (100%), while carbapenems and colistin retained effectiveness.
- In molecular docking and molecular dynamics simulations, Alatamide and Isosativan showed strong binding affinities and structural stability against DEC virulence targets; experimental validation is still needed.

## Abstract

Diarrheagenic Escherichia coli (DEC) is a leading cause of pediatric diarrhea, with antimicrobial resistance (AMR) complicating treatment. This study analyzed 350 E. coli isolates (175 DEC and 175 non‐DEC) to determine molecular pathotypes, resistance patterns, and therapeutic targets. Polymerase chain reaction and 16S ribosomal RNA sequencing identified enteropathogenic E. coli as the most prevalent DEC pathotype (35%), followed by enterotoxigenic E. coli (25%), enterohemorrhagic E. coli (15%), enteroinvasive E. coli (10%), and diffusely adherent E. coli (20%). Phylogenetic analysis confirmed distinct clustering between DEC and non‐DEC strains, revealing their evolutionary relationships. Antimicrobial susceptibility testing showed high resistance to ampicillin (87.6%), trimethoprim‐sulfamethoxazole (75.5%), and erythromycin (100%), while carbapenems and colistin retained effectiveness. Functional analysis using phylogenetic investigation of communities by reconstruction of unobserved states (PICRUSt) indicated enhanced metabolic and immune‐related functions in DEC strains, differentiating them from non‐DEC strains. Machine learning and bioinformatics‐driven drug discovery identified Alatamide and Isosativan as potential therapeutic compounds, exhibiting strong binding affinities and structural stability against DEC virulence targets through molecular docking and molecular dynamics simulations. This study provides critical insights into the epidemiology, genetic diversity, and resistance patterns of DEC and non‐DEC strains. The integration of bioinformatics and machine learning offers a promising strategy for discovering alternative treatments. Continuous AMR surveillance, responsible antibiotic use, and further experimental validation of identified drug candidates are essential to managing E. coli‐associated diarrheal infections in pediatric populations and mitigating the global burden of multidrug‐resistant pathogens.
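
The resistance percentages reported above are per-antibiotic proportions of isolates called resistant. As a minimal sketch of that tabulation, assuming each isolate carries a categorical S/I/R call per antibiotic (the toy records and the function name `resistance_rates` are hypothetical and are not the study's data or code):

```python
from collections import defaultdict


def resistance_rates(records):
    """Return {antibiotic: percent of tested isolates called resistant ('R')}.

    Intermediate ('I') and susceptible ('S') calls both count as not resistant.
    """
    tested = defaultdict(int)
    resistant = defaultdict(int)
    for calls in records:
        for drug, call in calls.items():
            tested[drug] += 1
            if call == "R":
                resistant[drug] += 1
    return {drug: round(100.0 * resistant[drug] / tested[drug], 1) for drug in tested}


# Hypothetical toy calls for four isolates (illustrative only).
toy_isolates = [
    {"ampicillin": "R", "meropenem": "S"},
    {"ampicillin": "R", "meropenem": "S"},
    {"ampicillin": "S", "meropenem": "S"},
    {"ampicillin": "R", "meropenem": "S"},
]
rates = resistance_rates(toy_isolates)
print(rates)
```

Counting per antibiotic rather than per isolate keeps the denominator correct when some isolates were not tested against every drug.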

This study integrates genomic epidemiology and machine learning–based virtual screening to identify novel therapeutic candidates, Alatamide and Isosativan, with strong binding stability against multidrug‐resistant diarrheagenic Escherichia coli virulence targets, highlighting computational drug discovery as a promising strategy to combat antimicrobial resistance in pediatric diarrheal infections.
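
This summary does not state which metrics were used to judge structural stability in the molecular dynamics simulations. Radius of gyration is one common choice, and the sketch below assumes it; the function name `radius_of_gyration` and the toy coordinates are hypothetical.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration for a list of (x, y, z) coordinates."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal masses when none are given
    total = sum(masses)
    # Center of mass along each axis.
    com = [sum(m * c[i] for m, c in zip(masses, coords)) / total for i in range(3)]
    # Mass-weighted sum of squared distances from the center of mass.
    sq = sum(
        m * sum((c[i] - com[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(sq / total)


# Hypothetical toy frame: four points at the corners of a unit square.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
rg = radius_of_gyration(square)
print(rg)
```

Evaluated per trajectory frame, a value that stays roughly constant is usually read as the complex remaining compact over the simulation.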

## Linked entities

- **Chemicals:** Isosativan (PubChem CID 591624), ampicillin (PubChem CID 6249), trimethoprim-sulfamethoxazole (PubChem CID 358641), erythromycin (PubChem CID 12560), carbapenems (PubChem CID 134085), colistin (PubChem CID 5311054)
- **Species:** Escherichia coli (taxon 562)

## Full-text entities

- **Genes:** STX2 (syntaxin 2) [NCBI Gene 2054] {aka EPIM, EPM, STX2A, STX2B, STX2C}, aggR [NCBI Gene 13877411], STX1A (syntaxin 1A) [NCBI Gene 6804] {aka HPC-1, P35-1, STX1, SYN1A}
- **Diseases:** MDR (MESH:D018088), gastrointestinal symptoms (MESH:D012817), AMR (MESH:D060467), diarrhea (MESH:D003967), malnutrition (MESH:D044342), gut (MESH:C536735), Diarrheal diseases (MESH:D004403), DAEC (MESH:D004927), HUS (MESH:D006463), infection (MESH:D007239), gastrointestinal disorders (MESH:D005767), immune system diseases (MESH:D007154), enteric and extra-enteric infections (MESH:D004751), watery diarrhea (MESH:D003969), gyration (MESH:D015799), secretory diarrhea (MESH:C564382), infectious (MESH:D003141)
- **Chemicals:** PRO (MESH:D011392), agar (MESH:D000362), ciprofloxacin (MESH:D002939), Lactose (MESH:D007785), sulfonamides (MESH:D013449), carbapenems (MESH:D015780), meropenem (MESH:D000077731), water (MESH:D014867), erythromycin (MESH:D004917), gentamicin (MESH:D005839), glycerol (MESH:D005990), penicillins (MESH:D010406), tetracycline (MESH:D013752), MAZ11 (-), fluoroquinolones (MESH:D024841), SER (MESH:D012694), propolis (MESH:D011429), ceftriaxone (MESH:D002443), macrolides (MESH:D018942), trimethoprim-sulfamethoxazole (MESH:D015662), Agarose (MESH:D012685), methyl red (MESH:C008492), imipenem (MESH:D015378), ampicillin (MESH:D000667), citrate (MESH:D019343), indole (MESH:C030374), Denbinobin (MESH:C436544), LYS (MESH:D008239), beta-lactam (MESH:D047090), hydrogen (MESH:D006859)
- **Species:** Escherichia coli (E. coli, species) [taxon 562], Salmonella enterica subsp. enterica serovar Typhi (no rank) [taxon 90370], Pseudomonas aeruginosa (species) [taxon 287], Homo sapiens (human, species) [taxon 9606], Staphylococcus aureus (species) [taxon 1280], Ehrlichia sp. IE-C (species) [taxon 371764], Shigella (genus) [taxon 620]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12927950/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12927950/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC12927950/full.md

---
Source: https://tomesphere.com/paper/PMC12927950