# The Application of Machine Learning in Predicting the Permeability of Drugs Across the Blood Brain Barrier

**Authors:** Sogand Jafarpour, Maryam Asefzadeh, Ehsan Aboutaleb

PMC · DOI: 10.5812/ijpr-149367 · Iranian Journal of Pharmaceutical Research : IJPR · 2024-11-24

## TL;DR

This paper explores how machine learning can predict whether drugs cross the blood-brain barrier (BBB). A voting classifier that combines the top three models reaches an AUC of 0.96.

## Contribution

A voting classifier that blends the three best-performing models, trained on Mordred chemical descriptors (MCDs), predicts BBB permeability with an AUC of 0.96, an F1 score of 0.91, and an MCC of 0.74.

## Key findings

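- The best model, a voting classifier trained on Mordred chemical descriptors (MCDs), achieved an AUC of 0.96.
- SHAP analysis of the best single model (an extra trees classifier trained on MCDs) identified the Lipinski rule of five as the most significant feature for BBB permeability prediction.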
- The model's performance is consistent with prior studies on CNS drug permeability.
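
To make the Lipinski rule of five concrete, here is a minimal sketch of the check using the commonly cited thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The `Properties` class, the function names, and the strict default of zero allowed violations are assumptions made for illustration; the paper's descriptor library may encode the rule differently, and some formulations tolerate one violation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Properties:
    """Hypothetical container for the four properties the rule inspects."""
    mol_weight: float   # molecular weight in daltons
    logp: float         # octanol-water partition coefficient (log scale)
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def lipinski_violations(p: Properties) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    checks = [
        p.mol_weight > 500,
        p.logp > 5,
        p.h_donors > 5,
        p.h_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(p: Properties, max_violations: int = 0) -> bool:
    """Return True when the violation count is within the allowed budget.

    max_violations=0 is an assumed strict default; pass 1 to tolerate a
    single violation.
    """
    return lipinski_violations(p) <= max_violations
```

The result is a single boolean per molecule, which is the kind of compact feature a tree-based model can split on directly.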

## Abstract

The inability of some medications to cross the blood-brain barrier (BBB) is often attributed to their poor physicochemical or pharmacokinetic properties. Recent studies have demonstrated promising outcomes using machine learning algorithms to predict drug permeability across the BBB. In light of these findings, our study was conducted to explore the potential of machine learning in predicting the permeability of drugs across the BBB.

We utilized the B3DB dataset, a comprehensive BBB permeability molecular database, to build machine learning models. The dataset comprises 7,807 molecules, including information on their permeability, stereochemistry, and physicochemical properties. After preprocessing and cleaning, various machine learning algorithms were implemented using the Python library PyCaret to predict permeability.

The extra trees classifier outperformed the other models with both feature sets, achieving an area under the curve (AUC) of 0.93 with Morgan fingerprints and 0.95 with Mordred chemical descriptors (MCDs) on the test dataset. We also trained a voting classifier that combines the top three performing models; the best blended model, trained on MCDs, achieved an AUC of 0.96. Finally, SHapley Additive exPlanations (SHAP) analysis was applied to our best-performing single model, the extra trees classifier trained on MCDs, and identified the Lipinski rule of five as the most significant feature in predicting BBB permeability.
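
The idea behind a voting classifier can be sketched in a few lines. The example below assumes soft voting, where each model's predicted probability of permeability is averaged per molecule; the abstract does not say whether soft or hard voting was used, and the function names and the 0.5 decision threshold are illustrative choices, not the authors' code.

```python
from typing import List, Optional, Sequence


def soft_vote(
    prob_lists: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Blend per-model positive-class probabilities by weighted average.

    prob_lists[m][i] is model m's predicted probability that molecule i
    is BBB-permeable. All models must score the same molecules in the
    same order.
    """
    if not prob_lists:
        raise ValueError("at least one model is required")
    n_samples = len(prob_lists[0])
    if any(len(p) != n_samples for p in prob_lists):
        raise ValueError("all models must score the same number of samples")
    if weights is None:
        weights = [1.0] * len(prob_lists)  # equal votes (assumed default)
    if len(weights) != len(prob_lists):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_samples)
    ]


def to_labels(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Convert blended probabilities to 1 (permeable) or 0 (not permeable)."""
    return [1 if p >= threshold else 0 for p in probs]
```

Calling `soft_vote` with three probability lists, one per base model, yields one blended score per molecule; those scores can be ranked for AUC or thresholded with `to_labels` for F1 and MCC.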

In conclusion, our combined model trained on MCDs achieved an AUC of 0.96, an F1 score of 0.91, and a Matthews correlation coefficient (MCC) of 0.74. These results are consistent with prior studies on CNS drug permeability, highlighting the potential of machine learning in this domain.
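
For readers less familiar with the three reported metrics, the sketch below computes each from scratch for a binary task, using only the standard library. It illustrates the standard definitions and is not the authors' evaluation code; the function names are hypothetical, and labels are assumed to be 1 for permeable and 0 for non-permeable.

```python
from math import sqrt
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); ignores true negatives."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews correlation coefficient; uses all four confusion cells."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of samples,
    which is fine for a sketch but slow for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Because F1 ignores true negatives while MCC uses all four cells of the confusion matrix, the two can differ noticeably on the same predictions, so an F1 score above the MCC is not unusual.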

## Full-text entities

- **Genes:** PGP (phosphoglycolate phosphatase) [NCBI Gene 283871] {aka AUM, G3PP, PGPase}, SHROOM4 (shroom family member 4) [NCBI Gene 57477] {aka MRXSSDS, SHAP, shrm4}, ABCB1 (ATP binding cassette subfamily B member 1) [NCBI Gene 5243] {aka ABC20, CD243, CLCS, ENPAT, GP170, MDR1}
- **Diseases:** GBM (MESH:D000141)
- **Chemicals:** lipid (MESH:D008055), water (MESH:D014867), octanol (MESH:D000442), hydrogen (MESH:D006859)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11892787/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11892787/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/PMC11892787/full.md

---
Source: https://tomesphere.com/paper/PMC11892787