Hermes: Large DEL Datasets Train Generalizable Protein-Ligand Binding Prediction Models
Maxwell Kleinsasser, Brayden J. Halverson, Edward Kraft, Sean Francis-Lyon, Sarah E. Hugo, Mackenzie R. Roman, Ben Miller, Andrew D. Blevins, Ian K. Quigley

TL;DR
Hermes is a lightweight transformer trained solely on large, diverse DEL datasets. It predicts protein-ligand interactions and generalizes to new targets and chemical structures without ever being trained on traditional affinity data.
Contribution
This work introduces Hermes, a lightweight transformer trained exclusively on large DEL datasets, and shows that it generalizes in protein-ligand binding prediction without being trained on traditional affinity measurements.
Findings
Hermes generalizes to unseen targets and chemical scaffolds.
DEL data alone captures transferable interaction representations.
Hermes enables fast inference suitable for virtual screening.
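Claims of generalization to unseen targets or scaffolds depend on an evaluation split in which no target (or scaffold) appears on both the training and held-out sides. The sketch below illustrates that idea with a generic grouped hold-out split. It is not the authors' protocol: the function name `group_holdout_split`, the record fields `target` and `scaffold`, and the toy data are all hypothetical, and scaffold keys are assumed to have been computed beforehand (for example as Bemis-Murcko scaffolds).

```python
import math
import random
from collections import defaultdict


def group_holdout_split(records, key, holdout_fraction=0.2, seed=0):
    """Split records so that no value of `key` appears on both sides.

    Hypothetical helper for illustration; not taken from the Hermes paper.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)

    # Sort before shuffling so the split is reproducible for a given seed.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)

    # Hold out whole groups, at least one, never individual records.
    n_holdout = max(1, math.ceil(len(group_ids) * holdout_fraction))
    holdout_ids = set(group_ids[:n_holdout])

    train = [r for g in group_ids if g not in holdout_ids for r in groups[g]]
    test = [r for g in group_ids if g in holdout_ids for r in groups[g]]
    return train, test


# Toy records with made-up identifiers; real DEL data would also carry labels.
records = [
    {"target": "T1", "scaffold": "S1"},
    {"target": "T1", "scaffold": "S2"},
    {"target": "T2", "scaffold": "S1"},
    {"target": "T3", "scaffold": "S3"},
    {"target": "T4", "scaffold": "S2"},
    {"target": "T4", "scaffold": "S3"},
]

# Held-out-target split: every record of a held-out target goes to `test`.
train, test = group_holdout_split(records, key="target", holdout_fraction=0.25)

# Held-out-scaffold split: the same helper, grouped by scaffold instead.
train_s, test_s = group_holdout_split(records, key="scaffold", holdout_fraction=0.25)
```

Splitting by group rather than by row is the design choice that matters here: a random row-level split would let the same target or scaffold leak into both sides and overstate generalization.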
Abstract
The quality and consistency of training data remain critical bottlenecks for protein-ligand binding prediction. Public affinity datasets, aggregated from thousands of labs and assay formats, introduce biases that limit model generalization and complicate evaluation. DNA-encoded chemical libraries (DELs) offer a potential solution: unified experimental protocols generating massive binding datasets across diverse chemical and protein target space. We present Hermes, a lightweight transformer trained exclusively on DEL data from screens against hundreds of protein targets, representing one of the largest and most protein-diverse DEL training sets applied to protein-ligand interaction (PLI) modeling to date. Despite never seeing traditional affinity measurements during training, Hermes generalizes to held-out targets, novel chemical scaffolds, and external benchmarks derived from public…
Taxonomy
Topics: Computational Drug Discovery Methods · Protein Structure and Dynamics · Chemical Synthesis and Analysis
