# Hybrid Dual-Context Prompted Cross-Attention Framework with Language Model Guidance for Multi-Label Prediction of Human Off-Target Ligand–Protein Interactions

**Authors:** Abdullah, Zulaikha Fatima, Muhammad Ateeb Ather, Liliana Chanona-Hernandez, José Luis Oropeza Rodríguez

PMC · DOI: 10.3390/ijms27021126 · International Journal of Molecular Sciences · 2026-01-22

## TL;DR

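This paper introduces HDPC-LGT, a multi-label deep learning model that predicts off-target ligand binding across sixteen human translation-related proteins by combining graph-based chemical features with protein language model embeddings, with the aim of flagging antibiotic toxicity risks earlier in drug discovery.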

## Contribution

HDPC-LGT is a framework that integrates graph-based chemical reasoning with protein language model embeddings and structural priors for multi-label prediction of ligand–protein interactions.
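To illustrate the kind of fusion the framework's name refers to, here is a minimal sketch of generic scaled dot-product cross-attention, with ligand atom embeddings as queries and protein residue embeddings as keys and values. It is not the authors' architecture: the learned projection matrices, multiple heads, and prompts are omitted, and the function names, embedding sizes, and toy vectors are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def softmax(xs: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries: Matrix, keys: Matrix, values: Matrix) -> Tuple[Matrix, Matrix]:
    """Generic scaled dot-product cross-attention (no learned projections).

    Returns (context, weights): one context vector per query, and one row of
    attention weights per query with one entry per key.
    """
    scale = math.sqrt(len(keys[0]))
    context: Matrix = []
    weights: Matrix = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors, one output channel at a time.
        context.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return context, weights


# Hypothetical toy embeddings: 2 ligand atoms and 3 protein residues, dimension 2.
atom_embeddings = [[1.0, 0.0], [0.0, 1.0]]
residue_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

context, attn = cross_attention(atom_embeddings, residue_embeddings, residue_embeddings)
for i, row in enumerate(attn):
    print(f"atom {i} attends to residues with weights {[round(w, 3) for w in row]}")
```

Each row of `attn` sums to 1 and can be read as a distribution over residues for one atom, which is the sort of quantity an attention map visualises.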

## Key findings

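- HDPC-LGT achieves a macro ROC–AUC of 0.996 and a micro F1-score of 0.989, outperforming DeepDTA, GraphDTA, MolTrans, CAT–DTI, and HGT–DTA by 3–7%.
- External validation on the Papyrus, PDBbind, and Yamanishi datasets indicates that the model generalises to unseen chemotypes and proteins.
- Cross-attention maps, Integrated Gradients, and Grad-CAM provide interpretable outputs that highlight catalytic residues in aminoacyl-tRNA synthetases (aaRSs), ribosomal tunnel regions, and pharmacophoric interaction patterns.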
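The two headline metrics can be made concrete with a small worked example. The sketch below is not the authors' evaluation code: it computes macro ROC–AUC (per-label AUC, then an unweighted mean) and micro F1 (counts pooled over all ligand–label cells) on invented toy data, and the 0.5 decision threshold is an assumption.

```python
from typing import List

Labels = List[List[int]]
Scores = List[List[float]]


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined without both classes present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_roc_auc(y_true: Labels, y_score: Scores) -> float:
    """Unweighted mean of the per-label (per-protein) AUC values."""
    n_labels = len(y_true[0])
    aucs = [
        roc_auc([row[j] for row in y_true], [row[j] for row in y_score])
        for j in range(n_labels)
    ]
    return sum(aucs) / n_labels


def micro_f1(y_true: Labels, y_pred: Labels) -> float:
    """F1 from true/false positives and false negatives pooled over every cell."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Invented toy data: 4 ligands (rows) x 3 protein targets (columns).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_score = [[0.9, 0.2, 0.8], [0.1, 0.7, 0.3], [0.8, 0.6, 0.7], [0.3, 0.4, 0.6]]
THRESHOLD = 0.5  # assumed decision threshold, not taken from the paper
y_pred = [[1 if s >= THRESHOLD else 0 for s in row] for row in y_score]

macro_auc = macro_roc_auc(y_true, y_score)
f1_micro = micro_f1(y_true, y_pred)
print(f"macro ROC-AUC = {macro_auc:.4f}")  # 0.9167
print(f"micro F1      = {f1_micro:.4f}")   # 0.9231
```

Macro averaging gives every protein target equal weight regardless of how many ligands bind it, whereas micro averaging pools all predictions, so the two numbers can diverge when labels are imbalanced.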

## Abstract

Accurately identifying drug off-targets is essential for reducing toxicity and improving the success rate of pharmaceutical discovery pipelines. However, current deep learning approaches often struggle to fuse chemical structure, protein biology, and multi-target context. Here, we introduce HDPC-LGT (Hybrid Dual-Prompt Cross-Attention Ligand–Protein Graph Transformer), a framework designed to predict ligand binding across sixteen human translation-related proteins clinically associated with antibiotic toxicity. HDPC-LGT combines graph-based chemical reasoning with protein language model embeddings and structural priors to capture biologically meaningful ligand–protein interactions. The model was trained on 216,482 experimentally validated ligand–protein pairs from the Chemical Database of Bioactive Molecules (ChEMBL) and the Protein–Ligand Binding Database (BindingDB) and evaluated using scaffold-level, protein-level, and combined holdout strategies. HDPC-LGT achieves a macro receiver operating characteristic–area under the curve (macro ROC–AUC) of 0.996 and a micro F1-score (micro F1) of 0.989, outperforming Deep Drug–Target Affinity Model (DeepDTA), Graph-based Drug–Target Affinity Model (GraphDTA), Molecule–Protein Interaction Transformer (MolTrans), Cross-Attention Transformer for Drug–Target Interaction (CAT–DTI), and Heterogeneous Graph Transformer for Drug–Target Affinity (HGT–DTA) by 3–7%. External validation using the Papyrus universal bioactivity resource (Papyrus), the Protein Data Bank binding subset (PDBbind), and the benchmark Yamanishi dataset confirms strong generalisation to unseen chemotypes and proteins. HDPC-LGT also provides biologically interpretable outputs: cross-attention maps, Integrated Gradients (IG), and Gradient-weighted Class Activation Mapping (Grad-CAM) highlight catalytic residues in aminoacyl-tRNA synthetases (aaRSs), ribosomal tunnel regions, and pharmacophoric interaction patterns, aligning with known biochemical mechanisms. 
By integrating multimodal biochemical information with deep learning, HDPC-LGT offers a practical tool for off-target toxicity prediction, structure-based lead optimisation, and polypharmacology research, with potential applications in antibiotic development, safety profiling, and rational compound redesign.
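
The scaffold-level, protein-level, and combined holdout strategies mentioned in the abstract can be sketched as a group-wise split. The following is a minimal illustration under stated assumptions, not the paper's protocol: scaffold keys are assumed to be precomputed (for example, Bemis–Murcko scaffolds from a cheminformatics toolkit), all identifiers and the 50% holdout fraction are invented, and treating the combined holdout as "unseen scaffold and unseen protein" is one common convention that the summary does not confirm.

```python
import random
from typing import Dict, Iterable, List, Set, Tuple

# (ligand_id, scaffold_key, protein_id); all values below are hypothetical.
Pair = Tuple[str, str, str]


def holdout_groups(groups: Iterable[str], frac: float, seed: int) -> Set[str]:
    """Pick a reproducible random subset of group keys to hold out."""
    ordered = sorted(set(groups))
    random.Random(seed).shuffle(ordered)
    n_test = max(1, round(frac * len(ordered)))
    return set(ordered[:n_test])


def split_pairs(
    pairs: List[Pair], test_scaffolds: Set[str], test_proteins: Set[str]
) -> Dict[str, List[Pair]]:
    """Route each pair by whether its scaffold and/or protein is held out."""
    splits: Dict[str, List[Pair]] = {
        "train": [],
        "scaffold_holdout": [],   # unseen scaffold, seen protein
        "protein_holdout": [],    # seen scaffold, unseen protein
        "combined_holdout": [],   # unseen scaffold and unseen protein
    }
    for pair in pairs:
        _, scaffold, protein = pair
        new_scaffold = scaffold in test_scaffolds
        new_protein = protein in test_proteins
        if new_scaffold and new_protein:
            splits["combined_holdout"].append(pair)
        elif new_scaffold:
            splits["scaffold_holdout"].append(pair)
        elif new_protein:
            splits["protein_holdout"].append(pair)
        else:
            splits["train"].append(pair)
    return splits


# Toy data: every combination of 4 scaffolds and 4 proteins.
scaffolds = ["scaf_A", "scaf_B", "scaf_C", "scaf_D"]
proteins = ["prot_1", "prot_2", "prot_3", "prot_4"]
pairs = [(f"lig_{s}_{p}", s, p) for s in scaffolds for p in proteins]

test_scaffolds = holdout_groups(scaffolds, frac=0.5, seed=0)
test_proteins = holdout_groups(proteins, frac=0.5, seed=1)
splits = split_pairs(pairs, test_scaffolds, test_proteins)

for name, members in splits.items():
    print(name, len(members))
```

Splitting by group rather than by individual pair is what prevents a scaffold or protein seen during training from leaking into the corresponding test set.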

## Full-text entities

- **Diseases:** toxicity (MESH:D064420)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12842375/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12842375/full.md

## References

62 references — full list in the complete paper: https://tomesphere.com/paper/PMC12842375/full.md

---
Source: https://tomesphere.com/paper/PMC12842375