# Precision Enhanced Bioactivity Prediction of Tyrosine Kinase Inhibitors by Integrating Deep Learning and Molecular Fingerprints Towards Cost-Effective and Targeted Cancer Therapy

**Authors:** Fatma Hilal Yagin, Yasin Gormez, Cemil Colak, Abdulmohsen Algarni, Fahaid Al-Hashem, Luca Paolo Ardigò

PMC · DOI: 10.3390/ph18070975 · Pharmaceuticals · 2025-06-28

## TL;DR

This study trains machine learning models on molecular fingerprints and physicochemical descriptors to predict the bioactivity of tyrosine kinase inhibitors, aiming to speed up preclinical drug development and improve targeted cancer treatment.

## Contribution

A novel machine learning framework integrating deep learning and molecular fingerprints to enhance bioactivity prediction of tyrosine kinase inhibitors.

## Key findings

- SVM achieved the highest F1-score (87.9%) and accuracy (85.1%) in predicting TKI bioactivity.
- Morgan fingerprints significantly improved model performance by enhancing structural feature recognition.
- The framework supports efficient compound selection and reduces experimental costs in drug development.
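
The F1-score and accuracy cited in the findings above are both derived from confusion-matrix counts. The following is a minimal sketch, assuming binary labels (1 = active, 0 = inactive); the function names and the toy labels are hypothetical illustrations, not the study's code or data.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = active."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the active class."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy labels (not from the paper): 6 actives, 4 inactives.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"F1-score = {f1_score(y_true, y_pred):.3f}")
```

Because F1 ignores true negatives, it can exceed accuracy when the active class is the majority, as in this toy example.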

## Abstract

Background and Objective: Dysregulated tyrosine kinase signaling is a central driver of tumorigenesis, metastasis, and therapeutic resistance. While tyrosine kinase inhibitors (TKIs) have revolutionized targeted cancer treatment, identifying compounds with optimal bioactivity remains a critical bottleneck. This study presents a robust machine learning framework, leveraging deep artificial neural networks (dANNs), convolutional neural networks (CNNs), and structural molecular fingerprints, to accurately predict TKI bioactivity, ultimately accelerating the preclinical phase of drug development.

Methods: A curated dataset of 28,314 small molecules from the ChEMBL database targeting 11 tyrosine kinases was analyzed. Using Morgan fingerprints and physicochemical descriptors (e.g., molecular weight, LogP, hydrogen bonding), ten supervised models, including dANN, SVM, CatBoost, and CNN, were trained and optimized through a randomized hyperparameter search. Model performance was evaluated using F1-score, ROC–AUC, precision–recall curves, and log loss.

Results: SVM achieved the highest F1-score (87.9%) and accuracy (85.1%), while dANNs yielded the lowest log loss (0.25096), indicating superior probabilistic reliability. CatBoost excelled in ROC–AUC and precision–recall metrics. The integration of Morgan fingerprints significantly improved bioactivity prediction across all models by enhancing structural feature recognition.

Conclusions: This work highlights the transformative role of machine learning, particularly dANNs and SVM, in rational drug discovery. By enabling accurate bioactivity prediction, our model pipeline can effectively reduce experimental burden, optimize compound selection, and support personalized cancer treatment design. The proposed framework advances kinase inhibitor screening pipelines and provides a scalable foundation for translational applications in precision oncology.
By enabling early identification of bioactive compounds with favorable pharmacological profiles, the results of this study may support more efficient candidate selection for clinical drug development, particularly with regard to cancer therapy and kinase-associated disorders.
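
The abstract treats low log loss as a sign of reliable probability estimates. The sketch below shows why, assuming the standard binary cross-entropy definition of log loss; the `log_loss` helper and the probability lists are hypothetical and are not taken from the paper.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; y_prob holds predicted P(active)."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t, p in zip(y_true, y_prob):
        # Clip probabilities so log(0) never occurs.
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical predictions: both models make the same hard calls at a
# 0.5 threshold, but one assigns more confident correct probabilities.
y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.9, 0.1]
hedged = [0.6, 0.4, 0.6, 0.4]

print(f"confident model log loss = {log_loss(y_true, confident):.4f}")
print(f"hedged model log loss    = {log_loss(y_true, hedged):.4f}")
```

The two hypothetical models would have identical accuracy and F1-score, yet the one with sharper correct probabilities has the lower log loss. This is why a model can lead on log loss without leading on F1, as the abstract reports for dANNs and SVM.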

## Linked entities

- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Genes:** TXK (TXK tyrosine kinase) [NCBI Gene 7294] {aka BTKL, PSCTK5, PTK4, RLK, TKL}
- **Diseases:** Cancer (MESH:D009369), tumorigenesis (MESH:D063646), metastasis (MESH:D009362)
- **Chemicals:** hydrogen (MESH:D006859)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12298502/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12298502/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/PMC12298502/full.md

---
Source: https://tomesphere.com/paper/PMC12298502