Pearl: A Foundation Model for Placing Every Atom in the Right Location

Genesis Research Team: Alejandro Dobles; Nina Jovic; Kenneth Leidal; Pranav Murugan; David C. Williams; Drausin Wulsin; Nate Gruver; Christina X. Ji; Korrawat Pruegsanusak; Gianluca Scarpellini; Ansh Sharma; Wojciech Swiderski; Andrea Bootsma; Richard Strong Bowen; Charlotte Chen; Jamin Chen; Marc André Dämgen; Benjamin DiFrancesco; J. D. Fishman; Alla Ivanova; Zach Kagin; David Li-Bland; Zuli Liu; Igor Morozov; Jeffrey Ouyang-Zhang; Frank C. Pickard IV; Kushal S. Shah; Ben Shor; Gabriel Monteiro da Silva; Roy Tal; Maxx Tessmer; Carl Tilbury; Cyr Vetcher; Daniel Zeng; Maruan Al-Shedivat; Aleksandra Faust; Evan N. Feinberg; Michael V. LeVine; Matteus Pan

arXiv:2510.24670·cs.LG·October 30, 2025

TL;DR

Pearl is a new foundation model for protein-ligand structure prediction that leverages synthetic data, SO(3)-equivariant architectures, and controllable inference to achieve state-of-the-art accuracy and generalization in drug discovery applications.

Contribution

The paper introduces Pearl, a model that combines large-scale synthetic training data, an SO(3)-equivariant diffusion architecture, and controllable inference to improve protein-ligand cofolding predictions.
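To make the term "SO(3)-equivariant" concrete, here is a minimal sketch of the property itself: a function f on atom coordinates is rotation-equivariant when rotating the input and then applying f gives the same result as applying f and then rotating. The `toy_update` function below is a hypothetical stand-in chosen for illustration and is not Pearl's architecture; the check uses a single rotation about the z-axis, which is one element of SO(3).

```python
import math

def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def toy_update(points, step=0.1):
    """Hypothetical equivariant layer: nudge each point toward the centroid."""
    n = len(points)
    centroid = tuple(sum(p[k] for p in points) / n for k in range(3))
    return [
        tuple(p[k] + step * (centroid[k] - p[k]) for k in range(3))
        for p in points
    ]

def max_abs_diff(a, b):
    """Largest coordinate-wise difference between two point lists."""
    return max(abs(u - v) for p, q in zip(a, b) for u, v in zip(p, q))

# Illustrative coordinates (not real atoms).
coords = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5), (-1.5, 0.3, 1.0)]
angle = 0.7

rotate_then_update = toy_update(rotate_z(coords, angle))
update_then_rotate = rotate_z(toy_update(coords), angle)

# For an equivariant function the two paths agree up to rounding error.
equivariance_error = max_abs_diff(rotate_then_update, update_then_rotate)
print(equivariance_error < 1e-9)
```

An architecture with this property built in does not have to learn from data that a rotated complex is the same complex, which is one common motivation for equivariant designs in structure prediction.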

Findings

01 Outperforms AlphaFold 3 and other baselines on public benchmarks, with improvements of 14% or more.

02 Achieves a 3.6x improvement on challenging real-world drug targets.

03 Performance scales with the size of the synthetic training data.

Abstract

Accurately predicting the three-dimensional structures of protein-ligand complexes remains a fundamental challenge in computational drug discovery that limits the pace and success of therapeutic design. Deep learning methods have recently shown strong potential as structural prediction tools, achieving promising accuracy across diverse biomolecular systems. However, their performance and utility are constrained by scarce experimental data, inefficient architectures, physically invalid poses, and the limited ability to exploit auxiliary information available at inference. To address these issues, we introduce Pearl (Placing Every Atom in the Right Location), a foundation model for protein-ligand cofolding at scale. Pearl addresses these challenges with three key innovations: (1) training recipes that include large-scale synthetic data to overcome data scarcity; (2) architectures that…
