SeedFold: Scaling Biomolecular Structure Prediction

Yi Zhou; Chan Lu; Yiming Ma; Wei Qu; Fei Ye; Kexin Zhang; Lan Wang; Minrui Gui; Quanquan Gu

arXiv:2512.24354 · q-bio.BM · January 2, 2026

Open Access

TL;DR

SeedFold is a biomolecular structure prediction model that scales up capacity through a width-scaling strategy for the Pairformer, a linear triangular attention mechanism, and a large-scale distillation dataset. On FoldBench, it outperforms AlphaFold3 on most protein-related tasks.

Contribution

The paper identifies an effective width-scaling strategy for the Pairformer, introduces a linear triangular attention mechanism that reduces computational complexity, and constructs a large-scale distillation dataset that substantially enlarges the training set.

Findings

01. On FoldBench, SeedFold outperforms AlphaFold3 on most protein-related tasks.

02. An effective width-scaling strategy for the Pairformer increases representation capacity.

03. Linear triangular attention reduces computational complexity, enabling efficient scaling.

Abstract

Highly accurate biomolecular structure prediction is a key component of developing biomolecular foundation models, and one of the most critical aspects of building foundation models is identifying the recipes for scaling the model. In this work, we present SeedFold, a folding model that successfully scales up the model capacity. Our contributions are threefold: first, we identify an effective width-scaling strategy for the Pairformer to increase representation capacity; second, we introduce a novel linear triangular attention that reduces computational complexity to enable efficient scaling; finally, we construct a large-scale distillation dataset to substantially enlarge the training set. Experiments on FoldBench show that SeedFold outperforms AlphaFold3 on most protein-related tasks.
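
The abstract does not spell out how the linear triangular attention is built, so the following is only a generic illustration of why linearized attention is cheaper, not SeedFold's formulation. It assumes a kernel feature map (here `elu(x) + 1`, a common choice in the linear-attention literature) in place of softmax, and omits the triangular bias and gating used in Pairformer-style blocks. All function names are hypothetical. The sketch shows that reordering the matrix products gives the same output while avoiding the N x N weight matrix.

```python
import math

def phi(x):
    # Assumed feature map elu(x) + 1: strictly positive, so normalizers stay nonzero.
    return x + 1.0 if x > 0 else math.exp(x)

def featurize(rows):
    return [[phi(x) for x in row] for row in rows]

def attention_quadratic(q, k, v):
    """Kernelized attention via explicit N x N weights: O(N^2 * d) time."""
    fq, fk = featurize(q), featurize(k)
    out = []
    for qi in fq:
        # One weight per key for this query (a full row of the N x N matrix).
        w = [sum(a * b for a, b in zip(qi, kj)) for kj in fk]
        z = sum(w)
        out.append([sum(wj * vj[c] for wj, vj in zip(w, v)) / z
                    for c in range(len(v[0]))])
    return out

def attention_linear(q, k, v):
    """Same output in O(N * d^2) time: summarize keys and values once, then reuse."""
    fq, fk = featurize(q), featurize(k)
    n, d, dv = len(fk), len(fk[0]), len(v[0])
    # d x dv summary: sum over positions of phi(k_j) outer v_j.
    kv = [[sum(fk[j][a] * v[j][c] for j in range(n)) for c in range(dv)]
          for a in range(d)]
    # Length-d summary used for normalization: sum over positions of phi(k_j).
    ksum = [sum(fk[j][a] for j in range(n)) for a in range(d)]
    out = []
    for qi in fq:
        z = sum(qi[a] * ksum[a] for a in range(d))
        out.append([sum(qi[a] * kv[a][c] for a in range(d)) / z
                    for c in range(dv)])
    return out

# Toy inputs: 3 positions, feature dimension 2.
q = [[0.1, -0.2], [0.5, 0.3], [-1.0, 0.7]]
k = [[0.4, 0.0], [-0.3, 0.9], [0.2, -0.6]]
v = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]
slow = attention_quadratic(q, k, v)
fast = attention_linear(q, k, v)
```

Triangular attention runs one such attention per row (or column) of the N x N pair representation, so under these assumptions the per-row cost drops from quadratic to linear in N, and the total from cubic to quadratic.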

Taxonomy

Topics: Machine Learning in Bioinformatics · Protein Structure and Dynamics · Bioinformatics and Genomic Networks