Pearl: A Foundation Model for Placing Every Atom in the Right Location
Genesis Research Team: Alejandro Dobles, Nina Jovic, Kenneth Leidal, Pranav Murugan, David C. Williams, Drausin Wulsin, Nate Gruver, Christina X. Ji, Korrawat Pruegsanusak, Gianluca Scarpellini, Ansh Sharma, Wojciech Swiderski, Andrea Bootsma, Richard Strong Bowen

TL;DR
Pearl is a new foundation model for protein-ligand structure prediction that leverages synthetic data, SO(3)-equivariant architectures, and controllable inference to achieve state-of-the-art accuracy and generalization in drug discovery applications.
Contribution
The paper introduces Pearl, a model that combines large-scale synthetic training data, an equivariant diffusion architecture, and flexible, controllable inference to improve protein-ligand cofolding predictions.
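To make the term "SO(3)-equivariant" concrete, here is a minimal sketch of the property itself, not of Pearl's architecture. A function on atomic coordinates is rotation-equivariant if rotating the input and then applying the function gives the same result as applying the function and then rotating the output. The names `rotate_z`, `toy_update`, and `max_abs_diff` are hypothetical helpers invented for this illustration, and the check uses a single rotation about the z-axis (one element of SO(3)) rather than a general rotation.

```python
import math

def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def toy_update(points, step=0.1):
    """Hypothetical toy layer (an assumption, not Pearl's model):
    move each atom a fraction `step` of the way toward the centroid."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return [
        tuple(p[i] + step * (centroid[i] - p[i]) for i in range(3))
        for p in points
    ]

def max_abs_diff(a, b):
    """Largest coordinate-wise absolute difference between two point sets."""
    return max(abs(u - v) for p, q in zip(a, b) for u, v in zip(p, q))

# Three made-up atom positions, purely illustrative.
atoms = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5), (-1.0, 1.0, 3.0)]
angle = 0.7

rotate_then_update = toy_update(rotate_z(atoms, angle))
update_then_rotate = rotate_z(toy_update(atoms), angle)

# For an equivariant function the two orders agree up to floating-point error.
equivariance_error = max_abs_diff(rotate_then_update, update_then_rotate)
print(equivariance_error < 1e-9)
```

The toy update is equivariant because the centroid rotates together with the atoms. A function that, for example, added a fixed offset along the x-axis would fail this check.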
Findings
Pearl outperforms AlphaFold 3 and other baselines on public benchmarks, with improvements of 14% or more.
Pearl achieves a 3.6x improvement on challenging real-world drug targets.
Model performance scales with the size of the synthetic training dataset.
Abstract
Accurately predicting the three-dimensional structures of protein-ligand complexes remains a fundamental challenge in computational drug discovery that limits the pace and success of therapeutic design. Deep learning methods have recently shown strong potential as structural prediction tools, achieving promising accuracy across diverse biomolecular systems. However, their performance and utility are constrained by scarce experimental data, inefficient architectures, physically invalid poses, and the limited ability to exploit auxiliary information available at inference. To address these issues, we introduce Pearl (Placing Every Atom in the Right Location), a foundation model for protein-ligand cofolding at scale. Pearl addresses these challenges with three key innovations: (1) training recipes that include large-scale synthetic data to overcome data scarcity; (2) architectures that…
