# A Proteochemometric Model for Ligands of the SLC5 Transporter Family

**Authors:** Martin Juhás, Gerhard Ecker

PMC · DOI: 10.1002/ardp.70183 · Archiv Der Pharmazie · 2026-01-16

## TL;DR

This paper develops a proteochemometric (PCM) model that predicts inhibitor activity and selectivity across the SLC5 transporter family and identifies Leu286 in hSGLT2 as a potential selectivity-determining residue.

## Contribution

A novel proteochemometric model for SLC5 family inhibitors that takes protein sequence as input, which enabled the identification of a previously unrecognized selectivity-determining residue.

## Key findings

- A proteochemometric model predicted SLC5 inhibitor activity and selectivity with Q² = 0.79 and MSE < 0.32 in 10-fold cross-validation.
- Leu286 in hSGLT2 was identified as a new potential key binding-site residue for inhibitor selectivity.
- The model performed well in predicting the effect of single-point mutations in hSGLT2 on the binding affinity of empagliflozin.

## Abstract

The SLC5 family of solute carriers is of significant interest for drug development due to its role in many disease processes. Building on the recent elucidation of SGLT2's structure, we developed a proteochemometric (PCM) model for SLC5 inhibitors to gain information on selectivity-driving amino acids in the binding site. Ensemble-based algorithms, namely random forest (RF) and gradient-boosted trees, proved best suited for the task, reaching high accuracy in both activity and selectivity predictions with Morgan circular fingerprints and Z-scales as ligand and protein features, respectively. Including the protein sequence as an input to the PCM model allowed identification of Leu286 in hSGLT2 as a new potential key binding-site residue crucial for selectivity. Furthermore, the PCM model also performed well in predicting the effect of single-point mutations in hSGLT2 on the binding affinity of empagliflozin. The obtained models are available in the form of a Jupyter notebook.
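A minimal sketch of how a PCM input row might be assembled, assuming a ligand bit vector concatenated with per-residue descriptors of the binding site. Everything named here is hypothetical: the 8-bit `ligand_bits` vector stands in for a Morgan fingerprint, the five-residue `wild_type_site` string is not the real hSGLT2 binding site, and Kyte-Doolittle hydropathy is swapped in as a one-dimensional substitute for the multi-component Z-scales used in the paper. It is not the authors' notebook code.

```python
# Kyte-Doolittle hydropathy index per amino acid (one-letter codes).
# ASSUMPTION: used here as a simple stand-in for Z-scales.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_site(site_residues):
    """Map each binding-site residue to its descriptor value."""
    return [KD_HYDROPATHY[residue] for residue in site_residues]


def pcm_features(ligand_bits, site_residues):
    """Concatenate the ligand block and the protein block into one row."""
    return list(ligand_bits) + encode_site(site_residues)


def mutate(site_residues, position, new_residue):
    """Return a copy of the site with one residue replaced (0-based position)."""
    mutated = list(site_residues)
    mutated[position] = new_residue
    return "".join(mutated)


# Hypothetical inputs, for illustration only.
ligand_bits = [1, 0, 0, 1, 0, 1, 0, 0]   # placeholder for a Morgan fingerprint
wild_type_site = "FLHQY"                 # placeholder binding-site residues
mutant_site = mutate(wild_type_site, 1, "A")  # in silico Leu -> Ala at index 1

wt_row = pcm_features(ligand_bits, wild_type_site)
mut_row = pcm_features(ligand_bits, mutant_site)

# Indices where the wild-type and mutant rows differ.
changed = [i for i, (a, b) in enumerate(zip(wt_row, mut_row)) if a != b]
print(changed)
```

Only the protein block differs between the two rows, so the difference between a trained regressor's predictions for `wt_row` and `mut_row` could be read as the predicted effect of the mutation on that ligand.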

A novel proteochemometric model was developed to predict the activity and selectivity of SLC5 family inhibitors. The model's performance was validated using 10-fold cross-validation on publicly available datasets, achieving Q² = 0.79 and MSE < 0.32. Detailed analysis of the results enabled the identification of a new, previously unrecognized selectivity-determining residue in the SLC5 family, offering novel insights for SLC5-targeted drug design.
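The two reported metrics can be illustrated with a short sketch. It assumes one common definition of the cross-validated coefficient, Q² = 1 − PRESS / SS_total, computed on predictions pooled from the held-out folds; this summary does not state which exact variant the authors used. The function names and the toy activity values are hypothetical.

```python
def fold_indices(n_samples, n_folds=10):
    """Split sample indices into n_folds interleaved held-out folds."""
    return [list(range(start, n_samples, n_folds)) for start in range(n_folds)]


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def q_squared(y_true, y_pred):
    """Q2 = 1 - PRESS / SS_total, with y_pred taken from held-out folds."""
    mean_true = sum(y_true) / len(y_true)
    press = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_total = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - press / ss_total


# Toy values for illustration only (not data from the paper).
observed = [5.0, 6.0, 7.0, 8.0]
held_out_predictions = [5.5, 6.0, 6.5, 8.0]

print(round(q_squared(observed, held_out_predictions), 3))
print(round(mse(observed, held_out_predictions), 3))
```

In a real run, each fold from `fold_indices` would be held out in turn, a model fitted on the remaining samples, and its predictions for the held-out samples written into `held_out_predictions` before the metrics are computed.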

## Linked entities

- **Proteins:** SLC5A2 (solute carrier family 5 member 2)
- **Chemicals:** empagliflozin (PubChem CID 11949646)

## Full-text entities

- **Genes:** SLC5A2 (solute carrier family 5 member 2) [NCBI Gene 6524] {aka SGLT2}
- **Chemicals:** empagliflozin (MESH:C570240)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12809628/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12809628/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/PMC12809628/full.md

---
Source: https://tomesphere.com/paper/PMC12809628