Accelerating supercritical pharmaceutical formulation via interpretable data-driven prediction of drug solubility
El-Sayed Khafagy, Amr Selim Abu Lila, Mahboubeh Pishnamazi

TL;DR
This paper introduces a machine learning framework to predict drug solubility in supercritical CO2, speeding up pharmaceutical formulation development.
Contribution
The contribution is an interpretable, data-driven framework that predicts drug solubility in supercritical CO2 and yields mechanistic insight into the factors that govern it.
Findings
Machine learning regressors such as Extreme Gradient Boosting and Support Vector Regression, combined in an ensemble, improve solubility prediction accuracy.
Sensitivity and amplitude-based analyses reveal key molecular and process factors affecting solubility.
The framework provides actionable insights for drug selection and supercritical processing design.
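A sensitivity analysis of the kind mentioned in the findings can be illustrated with a one-at-a-time perturbation of each input. The sketch below is only an illustration under stated assumptions: `toy_predict`, the feature names `temperature` and `pressure`, and the baseline values are hypothetical stand-ins, not the paper's model, descriptors, or data.

```python
def oat_sensitivity(predict, baseline, rel_step=0.05):
    """One-at-a-time sensitivity: relative output change per relative input change."""
    base = predict(baseline)
    scores = {}
    for name, value in baseline.items():
        # Perturb a single feature up and down, holding the others at baseline.
        up = dict(baseline)
        up[name] = value * (1 + rel_step)
        down = dict(baseline)
        down[name] = value * (1 - rel_step)
        # Central difference, normalized by the baseline prediction.
        scores[name] = (predict(up) - predict(down)) / (2 * rel_step * base)
    return scores


def toy_predict(x):
    # Hypothetical surrogate used only to exercise the routine;
    # a trained regressor's predict function would go here instead.
    return x["pressure"] ** 2 / x["temperature"]


baseline = {"temperature": 320.0, "pressure": 20.0}  # assumed values
scores = oat_sensitivity(toy_predict, baseline)

# Rank features by the magnitude of their normalized sensitivity.
ranking = sorted(scores, key=lambda k: abs(scores[k]), reverse=True)
```

A positive score means the prediction rises with that input and a negative score means it falls; ranking by absolute value gives a rough ordering of influence around the chosen baseline only.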
Abstract
Drug solubility in supercritical carbon dioxide (SC-CO2) plays a pivotal role in the development of particle engineering, drug loading, and solvent-free pharmaceutical formulations. However, experimental solubility determination in supercritical systems remains costly, time-consuming, and compound-specific. In this study, an interpretable data-driven framework is proposed to support pharmaceutical formulation scientists by accurately predicting drug solubility in SC-CO2 while elucidating the governing physicochemical factors. Multiple machine learning regressors, including Extreme Gradient Boosting and Support Vector Regression, were developed and further integrated into an ensemble strategy to enhance robustness and generalizability. Model performance was systematically optimized using bio-inspired metaheuristic algorithms, enabling efficient hyperparameter selection across complex,…
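The abstract's hyperparameter optimization with bio-inspired metaheuristics can be sketched with particle swarm optimization, one common algorithm of that family. This is an assumption for illustration: the source excerpt does not name the algorithms used, and `toy_validation_loss`, the two log-scaled hyperparameters, and their bounds are hypothetical. In practice the objective would be a cross-validated error of the regressor being tuned.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60, seed=0):
    """Minimal particle swarm optimizer over box-bounded parameters."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [objective(p) for p in pos]    # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best
    w, c1, c2 = 0.7, 1.5, 1.5                 # inertia, cognitive, social weights
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clip it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def toy_validation_loss(params):
    # Hypothetical smooth loss surface with its minimum at (1.0, -2.0);
    # stands in for a cross-validated error over two log-scaled hyperparameters.
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_loss = pso_minimize(
    toy_validation_loss, bounds=[(-3.0, 3.0), (-5.0, 1.0)]
)
```

Searching in log space is a common choice for scale-type hyperparameters because useful values often span several orders of magnitude.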
Taxonomy
Topics: Phase Equilibria and Thermodynamics · Computational Drug Discovery Methods · Machine Learning in Materials Science
