# Interpretable machine learning-driven QSAR modeling for coagulation factor X inhibitors: from molecular descriptors to predictive potency

**Authors:** Ali Onur Kaya

PMC · DOI: 10.1007/s10822-025-00758-2 · Journal of Computer-Aided Molecular Design · 2026-01-23

## TL;DR

This paper presents an interpretable machine learning QSAR framework that predicts the inhibitory potency and activity class of small-molecule Factor Xa (FXa) inhibitors, with the aim of supporting the design of safer, more selective anticoagulants.

## Contribution

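The study introduces an interpretable QSAR framework for FXa inhibitor design that pairs tree-ensemble models (ExtraTreesRegressor for potency, XGBoostClassifier for activity class) with SHAP-based descriptor attribution and an applicability domain assessment.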

## Key findings

- ExtraTreesRegressor predicted FXa inhibitory potency with an R² of 0.760 and an RMSE of 0.831 on the independent test set; XGBoostClassifier reached an accuracy of 0.91 and an ROC-AUC of 0.962 for activity classification.
- SHAP analysis identified electrostatic, topological, and polar surface descriptors as key contributors to FXa inhibition.
- Williams plots showed that most training and test compounds fall within the model's applicability domain, which supports its use in virtual screening.
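
The abstract reports model performance as R², RMSE, and accuracy. As a reminder of how these quantities are defined, here is a minimal sketch in plain Python. The potency values and labels are hypothetical toy numbers, not data from the paper, and the function names are illustrative rather than taken from the authors' code.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq_err / len(y_true))


def accuracy(labels_true, labels_pred):
    """Fraction of compounds whose predicted class matches the true class."""
    hits = sum(t == p for t, p in zip(labels_true, labels_pred))
    return hits / len(labels_true)


# Hypothetical potency values on a log scale (assumption, not from the paper).
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.5, 6.0, 6.5, 8.0]

# Hypothetical active (1) / inactive (0) labels.
labels_true = [1, 1, 0, 0]
labels_pred = [1, 0, 0, 0]

print(r2_score(y_true, y_pred))
print(rmse(y_true, y_pred))
print(accuracy(labels_true, labels_pred))
```

Because RMSE is expressed in the units of the target, the reported value of 0.831 is only meaningful relative to the potency scale used in the paper.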

## Abstract

Inhibition of Coagulation Factor X (FXa) is a clinically validated therapeutic strategy; however, developing safer and more selective inhibitors remains a major challenge. In this study, we developed an interpretable machine learning–based QSAR framework to predict both the inhibitory potency and activity class of small molecules targeting FXa. A structurally curated dataset of 6400 compounds was retrieved from ChEMBL, standardized, and encoded using 391 non-redundant Mordred descriptors following systematic filtering. Benchmarking of 42 regression and 42 classification algorithms identified ExtraTreesRegressor and XGBoostClassifier as the most robust models. The regression model achieved an R² of 0.760 and an RMSE of 0.831 on the independent test set, while the classification model reached an accuracy of 0.91 with balanced precision, recall, and an ROC-AUC of 0.962. SHAP (SHapley Additive exPlanations) analysis further enhanced interpretability by revealing that electrostatic, topological, and polar surface descriptors were the dominant contributors to FXa inhibitory potency. Applicability domain assessment using Williams plots confirmed that most compounds in both the training and test sets lay within the model’s reliable prediction space. Overall, the proposed QSAR pipeline integrates strong predictive performance with valuable mechanistic interpretability and rigorous validation, offering a practical computational tool for the virtual screening and rational design of novel FXa inhibitors.
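
The abstract mentions an applicability domain assessment with Williams plots. A Williams plot places each compound by its leverage (x-axis) and standardized residual (y-axis). The sketch below computes both quantities with NumPy on random toy data. The warning leverage h* = 3(p + 1)/n and the ±3 residual limit are common QSAR conventions assumed here; the summary does not state the thresholds the authors used, and the function names are hypothetical.

```python
import numpy as np


def williams_quantities(X_train, X_query, residuals_query, resid_std):
    """Return leverages, standardized residuals, and the warning leverage h*."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    n, p = X_train.shape
    # pinv guards against a singular X^T X when descriptors are collinear.
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)
    # h_i = x_i^T (X^T X)^-1 x_i for every query compound.
    leverages = np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)
    # Conventional warning leverage (assumption, not confirmed by the source).
    h_star = 3.0 * (p + 1) / n
    std_resid = np.asarray(residuals_query, dtype=float) / resid_std
    return leverages, std_resid, h_star


def inside_domain(leverages, std_resid, h_star, resid_limit=3.0):
    """Boolean mask: True where a compound lies inside the applicability domain."""
    return (leverages <= h_star) & (np.abs(std_resid) <= resid_limit)


# Toy data: 50 hypothetical compounds described by 3 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
residuals = rng.normal(scale=0.5, size=50)

lev, std_res, h_star = williams_quantities(X, X, residuals, residuals.std(ddof=1))
mask = inside_domain(lev, std_res, h_star)
print(f"h* = {h_star:.3f}, fraction inside domain = {mask.mean():.2f}")
```

Compounds with leverage above h* are structurally far from the training set, so their predictions are extrapolations even when the residual happens to be small.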

## Linked entities

- **Proteins:** F10 (coagulation factor X)

## Full-text entities

- **Genes:** F10 (coagulation factor X) [NCBI Gene 2159] {aka FX, FXA}, F2 (coagulation factor II, thrombin) [NCBI Gene 2147] {aka PT, RPRGL2, THPH1}
- **Diseases:** thromboembolic disorders (MESH:D013923), stroke (MESH:D020521), Coagulation (MESH:D001778), deep vein thrombosis (MESH:D020246), pulmonary embolism (MESH:D011655), bleeding (MESH:D006470), gastrointestinal and intracranial hemorrhage (MESH:D006471), atrial fibrillation (MESH:D001281), renal impairment (MESH:D007674)
- **Chemicals:** rivaroxaban (MESH:D000069552), vitamin K (MESH:D014812), salts (MESH:D012492), DOACs (-), apixaban (MESH:C522181), hydrogen (MESH:D006859)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12827418/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12827418/full.md

## References

2 references — full list in the complete paper: https://tomesphere.com/paper/PMC12827418/full.md

---
Source: https://tomesphere.com/paper/PMC12827418