SeedFold: Scaling Biomolecular Structure Prediction
Yi Zhou, Chan Lu, Yiming Ma, Wei Qu, Fei Ye, Kexin Zhang, Lan Wang, Minrui Gui, Quanquan Gu

TL;DR
SeedFold is a biomolecular structure prediction model that scales model capacity through width scaling of the Pairformer, a linear triangular attention mechanism, and a large-scale distillation dataset. On FoldBench, it outperforms AlphaFold3 on most protein-related tasks.
Contribution
The paper identifies an effective width-scaling strategy for the Pairformer, introduces a linear triangular attention mechanism that reduces computational complexity, and constructs a large-scale distillation dataset that substantially enlarges the training set.
Findings
SeedFold outperforms AlphaFold3 on most protein-related tasks in FoldBench.
An effective width-scaling strategy for the Pairformer increases representation capacity.
Linear triangular attention reduces computational complexity, enabling efficient scaling.
Abstract
Highly accurate biomolecular structure prediction is a key component of developing biomolecular foundation models, and one of the most critical aspects of building foundation models is identifying the recipes for scaling the model. In this work, we present SeedFold, a folding model that successfully scales up the model capacity. Our contributions are threefold: first, we identify an effective width-scaling strategy for the Pairformer to increase representation capacity; second, we introduce a novel linear triangular attention that reduces computational complexity to enable efficient scaling; finally, we construct a large-scale distillation dataset to substantially enlarge the training set. Experiments on FoldBench show that SeedFold outperforms AlphaFold3 on most protein-related tasks.
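To illustrate why a linear attention variant can lower cost on a pair representation, the following is a minimal sketch of generic kernelized linear attention applied row-wise to an (n, n, d) tensor. It is not SeedFold's mechanism, which the abstract does not specify: the feature map `elu(x) + 1`, the function names, and the omission of the triangle bias and gating used in standard triangular attention are all assumptions made for illustration. Row-wise softmax attention over such a tensor costs on the order of n^3 * d operations, while the reordered computation below costs on the order of n^2 * d^2.

```python
import numpy as np


def feature_map(x):
    # Positive feature map elu(x) + 1 (an assumed choice, not from the paper).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention_reference(q, k, v):
    # Reference for a single row: q, k are (n, d), v is (n, e).
    # Builds the explicit (n, n) weight matrix, so it is quadratic in n.
    weights = feature_map(q) @ feature_map(k).T
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def row_wise_linear_attention(pair_q, pair_k, pair_v):
    # pair_q, pair_k: (n, n, d); pair_v: (n, n, e).
    # For each row i, position j attends over all positions k in the same row.
    phi_q = feature_map(pair_q)
    phi_k = feature_map(pair_k)
    # Summarize keys and values once per row: (n, d, e). No (n, n) score matrix.
    kv = np.einsum("ikd,ike->ide", phi_k, pair_v)
    # Per-row normalizer built from the summed key features: (n, d).
    z = phi_k.sum(axis=1)
    numerator = np.einsum("ijd,ide->ije", phi_q, kv)
    denominator = np.einsum("ijd,id->ij", phi_q, z)
    return numerator / denominator[..., None]


rng = np.random.default_rng(0)
n, d, e = 6, 4, 3
pair_q = rng.normal(size=(n, n, d))
pair_k = rng.normal(size=(n, n, d))
pair_v = rng.normal(size=(n, n, e))
updated = row_wise_linear_attention(pair_q, pair_k, pair_v)
```

Both functions compute the same result; the second only reorders the matrix products so that the per-row key/value summary is formed before the queries are applied, which is the step that removes the cubic term.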
Taxonomy
Topics: Machine Learning in Bioinformatics · Protein Structure and Dynamics · Bioinformatics and Genomic Networks
