Boosting AlphaFold Protein Tertiary Structure Prediction through MSA Engineering and Extensive Model Sampling and Ranking in CASP16
Jian Liu, Pawan Neupane, Jianlin Cheng

TL;DR
This paper introduces MULTICOM4, a system that improves AlphaFold's protein structure predictions by using better sequence alignments and extensive model sampling.
Contribution
The novel approach combines MSA engineering, model sampling, and ensemble QA methods to enhance AlphaFold predictions.
Findings
MULTICOM4 achieved a TM-score of 0.902 for top-1 predictions in CASP16.
It predicted correct folds for 97.6% of domains with top-1 predictions.
Using multiple QA methods and model clustering improved ranking robustness.
Abstract
AlphaFold2 and AlphaFold3 have revolutionized protein structure prediction by enabling high-accuracy tertiary structure predictions for most single-chain proteins (monomers). However, obtaining high-quality predictions for some hard protein targets with shallow or noisy multiple sequence alignments (MSAs) and complicated multi-domain architectures remains challenging. Here, we present MULTICOM4, an integrative protein structure prediction system that uses diverse MSA generation, large-scale model sampling, and an ensemble model quality assessment (QA) strategy of combining individual QA methods to improve model generation and ranking of AlphaFold2 and AlphaFold3. In the 16th Critical Assessment of Techniques for Protein Structure Prediction (CASP16), our predictors built on MULTICOM4 ranked among the top performers out of 120 predictors in tertiary structure prediction and outperformed…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMachine Learning in Bioinformatics
