Billion-Scale Graph Foundation Models
Maya Bechler-Speicher, Yoel Gottlieb, Andrey Isakov, David Abensur, Ami Tavory, Daniel Haimovich, Ido Guy, Udi Weinsberg

TL;DR
This paper introduces GraphBFF, a scalable framework for building billion-parameter Graph Foundation Models (GFMs) that learn from large-scale heterogeneous graphs and outperform baselines by up to 31 PRAUC points across ten downstream tasks.
Contribution
The paper presents a scalable architecture and training methodology for billion-parameter GFMs on large heterogeneous graphs, together with neural scaling laws and practical training recipes.
Findings
Loss decreases predictably as model capacity or training data scales, depending on which factor is the bottleneck.
GraphBFF outperforms baselines by up to 31 PRAUC points.
GraphBFF is effective across ten diverse downstream graph tasks.
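To make the "PRAUC points" unit concrete, here is a minimal sketch of one common estimator of the area under the precision-recall curve (average precision), and of a gain expressed in points on a 0-100 scale. The function name, the toy labels and scores, and the reading of "points" as percentage points are assumptions for illustration; the paper may compute PRAUC differently.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: 0/1 ground-truth values; scores: model scores, higher = more positive.
    Ties in scores are broken by input order (an assumption of this sketch).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            ap += hits / rank  # precision at the rank of each positive
    return ap / total_pos


# Hypothetical toy data, not taken from the paper.
labels = [1, 0, 1, 0]
baseline_scores = [0.9, 0.8, 0.7, 0.1]
model_scores = [0.9, 0.2, 0.8, 0.1]

baseline_ap = average_precision(labels, baseline_scores)
model_ap = average_precision(labels, model_scores)
gain_points = 100 * (model_ap - baseline_ap)  # difference in PRAUC points
```

In this toy case the baseline ranks one negative above a positive, so its average precision is lower than the model's, and the difference times 100 is the gain in points.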
Abstract
Graph-structured data underpins many critical applications. While foundation models have transformed language and vision via large-scale pretraining and lightweight adaptation, extending this paradigm to general, real-world graphs is challenging. In this work, we present Graph Billion-Foundation-Fusion (GraphBFF): an end-to-end recipe for building billion-parameter Graph Foundation Models (GFMs) for large-scale heterogeneous graphs. Central to the recipe is the GraphBFF Transformer, a flexible and scalable architecture designed for practical billion-scale GFMs. Using the GraphBFF, we present neural scaling laws for heterogeneous graphs and show that loss decreases predictably as either model capacity or training data scales, depending on which factor is the bottleneck. The GraphBFF framework provides concrete methodologies for data batching, pretraining, and fine-tuning for building…
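As an illustration of what a "predictable" loss trend can look like, the sketch below fits a power law L(N) = a * N^(-b) to loss measurements by linear regression in log-log space, then extrapolates to a larger scale. The power-law form is an assumption borrowed from the wider scaling-law literature; the abstract does not state the functional form GraphBFF uses, and the data here is synthetic, not from the paper.

```python
import math


def fit_power_law(sizes, losses):
    """Least-squares fit of log(loss) = log(a) - b * log(size).

    Returns (a, b) for the assumed model loss = a * size ** (-b).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(loss) for loss in losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope


# Synthetic measurements generated from a known power law (hypothetical values).
sizes = [1e6, 1e7, 1e8, 1e9]  # e.g. parameter counts or training examples
losses = [5.0 * n ** -0.1 for n in sizes]

a, b = fit_power_law(sizes, losses)
predicted = a * (1e10) ** -b  # extrapolated loss at a 10x larger scale
```

The same fit can be run once against model size and once against data size; the bottleneck factor described in the abstract would be the one whose increase still reduces the loss.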
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Multimodal Machine Learning Applications
