MegaFold: System-Level Optimizations for Accelerating Protein Structure Prediction Models
Hoa La, Ahan Gupta, Alex Morehead, Jianlin Cheng, Minjia Zhang

TL;DR
MegaFold is a system-level optimization framework that accelerates training of protein structure prediction models such as AlphaFold3. It reduces peak memory usage and per-iteration training time, which allows training on longer sequences, and it runs on both NVIDIA and AMD GPUs.
Contribution
MegaFold introduces three cross-platform system optimizations to improve the training efficiency of AlphaFold3: ahead-of-time caching for the data pipeline, memory-efficient Triton kernels for EvoAttention, and fusion of small operators.
Findings
Reduces peak memory usage by up to 1.23×
Speeds up training iteration by up to 1.73× on NVIDIA GPUs
Enables training on 1.35× longer sequences without out-of-memory errors
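A short sketch of how to read these factors, assuming each "N×" figure is the ratio of the baseline value to the MegaFold value (this interpretation is an assumption; the listing does not define it). The helper name `fraction_saved` is hypothetical and used only for illustration.

```python
def fraction_saved(factor: float) -> float:
    """Fraction of the baseline removed, assuming baseline / optimized == factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 - 1.0 / factor


if __name__ == "__main__":
    # Factors taken from the findings listed above.
    for label, factor in [("peak memory", 1.23), ("iteration time", 1.73)]:
        print(f"{label}: {factor}x -> {fraction_saved(factor):.1%} below baseline")
```

Under that assumption, a 1.23× memory reduction corresponds to roughly 19% less peak memory rather than 23%, and a 1.73× speedup to roughly 42% less time per iteration.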
Abstract
Protein structure prediction models such as AlphaFold3 (AF3) push the frontier of biomolecular modeling by incorporating science-informed architectural changes to the transformer architecture. However, these advances come at a steep system cost: compute- and memory-intensive operators, 2D attention mechanisms, and retrieval-augmented data pipelines collectively hinder the scalability of AF3 training. In this work, we present MegaFold, a cross-platform system to accelerate AF3 training. MegaFold tackles key bottlenecks through ahead-of-time caching to eliminate GPU idle time from the retrieval-augmented data pipeline, Triton-based kernels for memory-efficient EvoAttention on heterogeneous devices, and deep fusion for common and critical small operators in AF3. Evaluation on both NVIDIA H200 and AMD MI250 GPUs shows that MegaFold reduces peak memory usage of AF3…
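The ahead-of-time caching idea mentioned in the abstract can be illustrated with a minimal sketch: run the expensive retrieval and featurization once before training, store the results on disk, and have the training loop only read them back. This is not MegaFold's implementation. The functions `featurize`, `build_cache`, and `load_features`, the pickle file layout, and the toy features are all hypothetical and chosen only to show the pattern.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def featurize(sequence: str) -> dict:
    """Hypothetical stand-in for an expensive retrieval/featurization step."""
    counts = {aa: sequence.count(aa) for aa in sorted(set(sequence))}
    return {"length": len(sequence), "residue_counts": counts}


def cache_path(cache_dir: Path, sequence: str) -> Path:
    # Key each entry by a hash of its input so lookups need no index file.
    digest = hashlib.sha256(sequence.encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}.pkl"


def build_cache(sequences, cache_dir: Path) -> int:
    """Run before training: featurize each new sequence once and store it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    written = 0
    for seq in sequences:
        path = cache_path(cache_dir, seq)
        if not path.exists():
            with path.open("wb") as f:
                pickle.dump(featurize(seq), f)
            written += 1
    return written


def load_features(sequence: str, cache_dir: Path) -> dict:
    """Run during training: read precomputed features instead of recomputing."""
    with cache_path(cache_dir, sequence).open("rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache_dir = Path(tmp)
        build_cache(["MKTAYIAK", "GSHMLE"], cache_dir)
        print(load_features("GSHMLE", cache_dir)["length"])
```

With the cache built in a separate pass, the data loader's per-sample cost in this sketch is a single file read, which is the kind of change that keeps an accelerator from waiting on the input pipeline.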
Taxonomy
Topics: Protein Structure and Dynamics · Microbial Metabolic Engineering and Bioproduction · Machine Learning in Bioinformatics
