ToxBench: A Binding Affinity Prediction Benchmark with AB-FEP-Calculated Labels for Human Estrogen Receptor Alpha

Meng Liu; Karl Leswing; Simon K. S. Chu; Farhad Ramezanghorbani; Griffin Young; Gabriel Marques; Prerna Das; Anjali Panikar; Esther Jamir; Mohammed Sulaiman Shamsudeen; K. Shawn Watts; Ananya Sen; Hari Priya Devannagari; Edward B. Miller; Muyun Lihan; Howook Hwang; Janet Paulsen; Xin Yu; Kyle Gion; Timur Rvachov; Emine Kucukbenli; Saee Gopal Paliwal

arXiv:2507.08966 · cs.LG · July 15, 2025

TL;DR

This paper introduces ToxBench, a large-scale dataset of AB-FEP-computed binding affinities for human estrogen receptor alpha (ERα). It supports the development and benchmarking of machine learning models that approximate high-accuracy physics-based predictions at far lower computational cost.

Contribution

The paper presents ToxBench, the first large-scale AB-FEP dataset for ML development on ERα, and introduces DualBind, a novel ML model with a dual-loss framework for improved binding energy prediction.

Findings

1. DualBind outperforms existing ML models in benchmark tests.

2. ML models can approximate AB-FEP predictions with high accuracy.

3. ToxBench enables effective ML training and evaluation for binding affinity prediction.

Abstract

Protein-ligand binding affinity prediction is essential for drug discovery and toxicity assessment. While machine learning (ML) promises fast and accurate predictions, its progress is constrained by the availability of reliable data. In contrast, physics-based methods such as absolute binding free energy perturbation (AB-FEP) deliver high accuracy but are computationally prohibitive for high-throughput applications. To bridge this gap, we introduce ToxBench, the first large-scale AB-FEP dataset designed for ML development and focused on a single pharmaceutically critical target, Human Estrogen Receptor Alpha (ERα). ToxBench contains 8,770 ERα-ligand complex structures with binding free energies computed via AB-FEP, a subset of which is validated against experimental affinities at 1.75 kcal/mol RMSE, along with non-overlapping ligand splits to assess model generalizability. Using…
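
The abstract mentions two evaluation ideas: non-overlapping ligand splits and an RMSE reported in kcal/mol. Both can be illustrated with a minimal Python sketch. The record layout (`ligand_id`, `dg_fep`), the function names `split_by_ligand` and `rmse`, and the toy values are assumptions made for illustration only; they are not the ToxBench file format, real ToxBench data, or the authors' splitting procedure.

```python
import math
import random


def split_by_ligand(records, test_fraction=0.2, seed=0):
    """Split records so that no ligand ID appears in both train and test.

    `records` is a list of dicts with a hypothetical "ligand_id" key.
    """
    ligand_ids = sorted({r["ligand_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(ligand_ids)
    n_test = max(1, round(len(ligand_ids) * test_fraction))
    test_ids = set(ligand_ids[:n_test])
    train = [r for r in records if r["ligand_id"] not in test_ids]
    test = [r for r in records if r["ligand_id"] in test_ids]
    return train, test


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences (same units)."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Toy records (made-up values in kcal/mol): several poses may share one ligand.
records = [
    {"ligand_id": "lig_a", "dg_fep": -10.2},
    {"ligand_id": "lig_a", "dg_fep": -9.8},
    {"ligand_id": "lig_b", "dg_fep": -7.5},
    {"ligand_id": "lig_c", "dg_fep": -8.9},
    {"ligand_id": "lig_d", "dg_fep": -11.1},
    {"ligand_id": "lig_e", "dg_fep": -6.4},
]

train, test = split_by_ligand(records, test_fraction=0.2, seed=0)
train_ids = {r["ligand_id"] for r in train}
test_ids = {r["ligand_id"] for r in test}

# Errors of -1 and +1 kcal/mol give an RMSE of exactly 1 kcal/mol.
example_rmse = rmse([-10.0, -8.0], [-9.0, -9.0])
```

Splitting on ligand identity rather than on individual records keeps every structure of a given ligand on one side of the split, which is the property a "non-overlapping" split needs in order to measure generalization to unseen ligands.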

Code & Models

Hugging Face model: nvidia/NV-DualBind-1M-v1
