State-aware protein-ligand complex prediction using AlphaFold3 with purified sequences
Enming Xing, Junjie Zhang, Shen Wang, Xiaolin Cheng

TL;DR
This paper introduces a state-aware prediction method using purified sequences to improve protein-ligand complex modeling with AlphaFold3, addressing limitations in predicting novel chemotypes and conformational changes.
Contribution
The authors developed a state-aware strategy leveraging AF-ClaSeq to select sequence subsets encoding specific structural states, enhancing ligand pose prediction accuracy.
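The idea of selecting a sequence subset that encodes one structural state can be illustrated with a minimal sketch. This is not AF-ClaSeq's actual procedure, which this summary does not describe. It assumes a hypothetical input, `votes`, that maps each MSA sequence to the state labels it received across repeated predictions. The function names, the `min_fraction` threshold, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def select_state_subset(votes, state, min_fraction=0.6):
    """Return IDs of sequences whose recorded labels favor `state`.

    `votes` is a hypothetical mapping: sequence ID -> list of state labels.
    """
    selected = []
    for seq_id, labels in votes.items():
        if not labels:
            continue  # no evidence for this sequence, so skip it
        fraction = Counter(labels)[state] / len(labels)
        if fraction >= min_fraction:
            selected.append(seq_id)
    return selected


def filter_msa(msa, query_id, keep_ids):
    """Build A3M-style text: query first, then kept sequences in original order."""
    keep = set(keep_ids)
    lines = [f">{query_id}", msa[query_id]]
    for seq_id, seq in msa.items():
        if seq_id != query_id and seq_id in keep:
            lines += [f">{seq_id}", seq]
    return "\n".join(lines) + "\n"


# Toy data (invented for illustration only).
msa = {"query": "MKTAYIAK", "s1": "MKTAYLAK", "s2": "MRTAYIAR", "s3": "MKSAYIAK"}
votes = {
    "s1": ["open", "open", "open"],
    "s2": ["closed", "closed", "open"],
    "s3": ["open", "closed", "open"],
}

open_ids = select_state_subset(votes, "open", min_fraction=0.6)
open_msa = filter_msa(msa, "query", open_ids)
```

In a workflow of this kind, the filtered alignment would then be supplied to the structure predictor in place of the full MSA, so that the coevolutionary signal it sees is biased toward the chosen state.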
Findings
Significant improvement in ligand pose prediction accuracy.
Corrected previous AlphaFold3 failures by selecting relevant functional states.
Demonstrated broad applicability of the approach across molecular modeling tasks.
Abstract
Deep learning-based prediction of protein-ligand complexes has advanced significantly with the development of architectures such as AlphaFold3, Boltz-1, Chai-1, Protenix, and NeuralPlexer. Multiple sequence alignment (MSA) has been a key input, providing coevolutionary information critical for structural inference. However, recent benchmarks reveal a major limitation: these models often memorize ligand poses from training data and perform poorly on novel chemotypes or dynamic binding events involving substantial conformational changes in binding pockets. To overcome this, we introduced a state-aware protein-ligand prediction strategy leveraging purified sequence subsets generated by AF-ClaSeq, a method previously developed by our group. AF-ClaSeq isolates coevolutionary signals and selects sequences that preferentially encode distinct structural states as predicted by AlphaFold2. By…
Taxonomy
Topics: Computational Drug Discovery Methods · Protein Structure and Dynamics · Machine Learning in Bioinformatics
