ProtDBench: A Unified Benchmark of Protein Binder Design and Evaluation
Cong Liu, Milong Ren, Jiaqi Guan, Chengyue Gong, Jinyuan Sun, Xinshi Chen, Wenzhi Xiao

TL;DR
ProtDBench introduces a standardized, throughput-aware evaluation framework for protein binder design, enabling fair comparison and analysis of different methods under realistic conditions.
Contribution
ProtDBench provides a unified benchmark with standardized protocols, success criteria, and throughput-aware metrics, addressing the variability and bias in current evaluation practices.
Findings
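Structure prediction models commonly used as evaluation verifiers show substantial verifier-dependent bias and limited agreement, even under identical filtering protocols.
Benchmarking representative open-source generative binder design methods across ten diverse protein targets under a fixed evaluation protocol reveals differences between methods.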
Throughput-aware metrics balance success rate and efficiency.
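To make the idea of a throughput-aware metric concrete, here is a minimal sketch. It is not the paper's metric definition, which this summary does not give. The `RunSummary` fields, the `successes_per_gpu_hour` function, and all numbers are hypothetical and chosen only to show how a ranking by success rate can differ from a ranking by successes per unit of compute.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical per-method summary; field names are illustrative only."""
    method: str
    n_designs: int    # designs generated
    n_success: int    # designs passing a fixed in silico filter
    gpu_hours: float  # total compute spent generating the designs


def success_rate(run: RunSummary) -> float:
    """Fraction of generated designs that pass the filter."""
    return run.n_success / run.n_designs if run.n_designs else 0.0


def successes_per_gpu_hour(run: RunSummary) -> float:
    """Passing designs obtained per GPU-hour (an assumed throughput measure)."""
    return run.n_success / run.gpu_hours if run.gpu_hours > 0 else 0.0


# Toy values, not results from the paper.
runs = [
    RunSummary("method_a", n_designs=1000, n_success=50, gpu_hours=100.0),
    RunSummary("method_b", n_designs=1000, n_success=20, gpu_hours=10.0),
]

for run in runs:
    print(run.method, success_rate(run), successes_per_gpu_hour(run))
```

In this toy case `method_a` has the higher success rate while `method_b` yields more passing designs per GPU-hour, which is the kind of trade-off a throughput-aware comparison is meant to surface.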
Abstract
Recent advances in de novo protein binder design have enabled increasing experimental validation, yet reported in silico metrics remain difficult to interpret or compare across studies due to non-standardized evaluation protocols. We introduce ProtDBench, a standardized and throughput-aware evaluation framework for protein binder design. ProtDBench defines unified benchmark tasks, evaluation protocols, and success criteria, enabling systematic analysis of how evaluation design influences observed performance. Using a large wet-lab annotated dataset, we analyze commonly used structure prediction models as evaluation verifiers, revealing substantial verifier-dependent bias and limited agreement under identical filtering protocols. We then benchmark representative open-source generative binder design methods across ten diverse protein targets under a fixed evaluation protocol. Beyond…
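One way to quantify agreement between two verifiers applied to the same designs under the same filter is sketched below. This is an assumption for illustration; the summary does not say which agreement statistics ProtDBench uses. The functions `jaccard` and `cohen_kappa`, the verifier names, and the pass/fail values are all hypothetical.

```python
from typing import Sequence


def jaccard(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Overlap of the two pass sets: |both pass| / |either passes|."""
    both = sum(x and y for x, y in zip(a, b))
    either = sum(x or y for x, y in zip(a, b))
    return both / either if either else 1.0


def cohen_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Chance-corrected agreement between two binary pass/fail labelings."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("inputs must be non-empty and of equal length")
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa = sum(a) / n  # pass rate of verifier a
    pb = sum(b) / n  # pass rate of verifier b
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:  # both verifiers give one constant label
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy pass/fail calls for six designs under one filter; not data from the paper.
verifier_x = [True, True, False, False, True, False]
verifier_y = [True, False, False, False, True, True]

print(jaccard(verifier_x, verifier_y))
print(cohen_kappa(verifier_x, verifier_y))
```

Both verifiers pass half of the toy designs, yet they agree on only some of them, so equal pass rates alone do not imply that two verifiers select the same designs.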
