# DeepEGFR: a graph neural network for bioactivity classification of EGFR inhibitors

**Authors:** Aijaz Ahmad Malik, Costerwell Khyriem, Sven Hauns, Imran Khan, Frederico G. Pinto, Azzat Al-Sadi, Rasheed Mohammad, Van Dinh Tran, Rolf Backofen, Nelson Soares, Mohammed Uddin, Omer S. Alkhnbashi

PMC · DOI: 10.1038/s41598-025-22126-8 · Scientific Reports · 2025-10-31

## TL;DR

DeepEGFR is a graph neural network that classifies small molecules as active, intermediate, or inactive inhibitors of EGFR, a key target in cancer treatment. It reaches approximately 94% F1 and flags 300 underexplored candidate compounds for drug discovery.

## Contribution

The contribution is DeepEGFR, a multi-class graph neural network that outperforms baseline machine learning methods (e.g., SVM, Random Forest, ANN) at classifying EGFR inhibitors.

## Key findings

- DeepEGFR achieved F1-scores of approximately 94% on both training and test datasets when classifying compounds as active, inactive, or intermediate EGFR inhibitors.
- The model identified 300 underexplored compounds with potential to target EGFR, aiding drug discovery.
- The top 20 features identified by DeepEGFR were validated against five key characteristics of FDA-approved EGFR inhibitors, confirming their biological relevance.
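For readers unfamiliar with the metric, the following is a minimal sketch of how a per-class F1-score and its unweighted (macro) average can be computed for a three-class problem. The labels below are invented toy data, not the paper's results, and the summary does not state which averaging scheme the authors used; the function names are hypothetical.

```python
# Illustrative sketch only: toy labels, not data from the paper.
# The class names mirror the three activity categories in the summary.
CLASSES = ("Active", "Intermediate", "Inactive")


def per_class_f1(y_true, y_pred, classes=CLASSES):
    """Return {class: F1}, where F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # Convention assumed here: F1 is 0.0 when a class never appears.
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1-scores."""
    return sum(scores.values()) / len(scores)


# Hypothetical ground truth and predictions for six compounds.
y_true = ["Active", "Active", "Intermediate", "Intermediate", "Inactive", "Inactive"]
y_pred = ["Active", "Active", "Intermediate", "Inactive", "Inactive", "Inactive"]

scores = per_class_f1(y_true, y_pred)
macro = macro_f1(scores)
print(scores, round(macro, 4))
```

Reporting the score per class, as the summary does, guards against a model that performs well on the largest category while misclassifying the smaller ones.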

## Abstract

Epidermal Growth Factor Receptor (EGFR) plays a critical role in the development of several cancers. Thus, modulating or inhibiting EGFR activity is an appealing strategy for developing novel cancer therapeutics. With the advent of modern machine learning technologies, it is now possible to simulate interactions between EGFR and small molecules with high precision and to predict inhibitory/modulatory activity at an unprecedented scale. In this work, we propose a novel machine-learning method for fast and precise classification of small compounds as active, intermediate, or inactive in inhibiting/modulating EGFR activity. We developed DeepEGFR, a novel multi-class graph neural network (GNN) model, to classify compounds into Active, Inactive, and Intermediate functional categories. DeepEGFR leverages complementary molecular representations, combining SMILES strings and molecular fingerprint matrices (Klekota-Roth and PubChem) to capture both structural and property-based features of compounds. The model constructs a molecular graph whose nodes and edges encode atom type, formal charge, bond type, and bond order. DeepEGFR achieved superior performance compared to baseline machine learning algorithms (e.g., SVM, Random Forest, ANN), with approximately 94% F1-scores across training and test datasets for all activity classes. To ensure interpretability, the top 20 features identified by DeepEGFR were validated against the five key characteristics of FDA-approved EGFR inhibitors (Afatinib, Gefitinib, Osimertinib, Dacomitinib, Erlotinib), confirming the biological relevance of the features. Moreover, DeepEGFR successfully identified 300 underexplored EGFR-targeting compounds, demonstrating its potential to accelerate the discovery of therapeutic agents. These results highlight the effectiveness of graph neural networks in advancing molecular activity classification, setting a potential new benchmark for EGFR inhibitor prediction.
These findings demonstrate DeepEGFR’s ability to highlight promising EGFR inhibitors that have received limited prior investigation, thereby supporting its role in facilitating the rational development of targeted therapies for precision oncology.
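The molecular graph mentioned in the abstract can be pictured with a small sketch: atoms become nodes carrying atom type and formal charge, and bonds become edges carrying bond type and bond order. The example below hand-builds the acetate ion (SMILES `CC(=O)[O-]`, heavy atoms only) in plain Python. The class names, the feature layout, and the three-symbol vocabulary are assumptions made for illustration; this is not the authors' implementation, which would typically parse SMILES with a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    symbol: str             # atom type, e.g. "C", "N", "O"
    formal_charge: int = 0


@dataclass(frozen=True)
class Bond:
    begin: int              # index of the first atom
    end: int                # index of the second atom
    bond_type: str          # e.g. "SINGLE", "DOUBLE", "AROMATIC"
    bond_order: float       # e.g. 1.0, 2.0, 1.5


@dataclass
class MolGraph:
    atoms: list
    bonds: list

    def node_features(self, vocab):
        """One-hot atom type over `vocab`, followed by the formal charge."""
        rows = []
        for atom in self.atoms:
            one_hot = [1 if atom.symbol == s else 0 for s in vocab]
            rows.append(one_hot + [atom.formal_charge])
        return rows

    def edge_index(self):
        """Directed edge list; each undirected bond appears in both directions."""
        edges = []
        for b in self.bonds:
            edges.append((b.begin, b.end))
            edges.append((b.end, b.begin))
        return edges

    def neighbors(self, idx):
        """Sorted indices of atoms bonded to atom `idx`."""
        return sorted(dst for src, dst in self.edge_index() if src == idx)


# Acetate ion, CC(=O)[O-]; hydrogens are left implicit in this sketch.
acetate = MolGraph(
    atoms=[Atom("C"), Atom("C"), Atom("O"), Atom("O", formal_charge=-1)],
    bonds=[
        Bond(0, 1, "SINGLE", 1.0),
        Bond(1, 2, "DOUBLE", 2.0),
        Bond(1, 3, "SINGLE", 1.0),
    ],
)

features = acetate.node_features(vocab=("C", "N", "O"))
print(features)
print(acetate.edge_index())
```

Storing each bond in both directions is a common convention for message-passing networks, since it lets information flow either way along an undirected bond.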

The online version contains supplementary material available at 10.1038/s41598-025-22126-8.

## Linked entities

- **Genes:** EGFR (epidermal growth factor receptor) [NCBI Gene 1956]
- **Chemicals:** Afatinib (PubChem CID 10184653), Gefitinib (PubChem CID 123631), Osimertinib (PubChem CID 71496458), Dacomitinib (PubChem CID 11511120), Erlotinib (PubChem CID 176870)

## Full-text entities

- **Genes:** EGFR (epidermal growth factor receptor) [NCBI Gene 1956] {aka ERBB, ERBB1, ERRP, HER1, NISBD2, NNCIS}
- **Diseases:** cancer (MESH:D009369)
- **Chemicals:** Gefitinib (MESH:D000077156), Afatinib (MESH:D000077716), Dacomitinib (MESH:C525726), Osimertinib (MESH:C000596361), Erlotinib (MESH:D000069347)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12578785/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12578785/full.md

## References

7 references — full list in the complete paper: https://tomesphere.com/paper/PMC12578785/full.md

---
Source: https://tomesphere.com/paper/PMC12578785