PyTorch-BigGraph: A Large-scale Graph Embedding System
Adam Lerer, Ledell Wu, Jiajun Shen, Timothee Lacroix, Luca Wehrstedt, Abhijit Bose, Alex Peysakhovich

TL;DR
PyTorch-BigGraph is a scalable graph embedding system capable of handling graphs with billions of nodes and trillions of edges, enabling large-scale machine learning tasks.
Contribution
It introduces a graph partitioning approach within a distributed framework to scale traditional embedding methods to massive graphs.
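The partitioning idea can be illustrated with a minimal sketch. It assumes nodes are assigned to partitions uniformly at random and that each edge is grouped into a bucket keyed by the partitions of its two endpoints. The function names `partition_nodes` and `bucket_edges` are hypothetical and are not taken from the PBG codebase.

```python
import random
from collections import defaultdict


def partition_nodes(num_nodes, num_partitions, seed=0):
    """Assign each node id to a partition (assumption: uniform random assignment)."""
    rng = random.Random(seed)
    return [rng.randrange(num_partitions) for _ in range(num_nodes)]


def bucket_edges(edges, node_partition):
    """Group edges into buckets keyed by (source partition, destination partition)."""
    buckets = defaultdict(list)
    for src, dst in edges:
        key = (node_partition[src], node_partition[dst])
        buckets[key].append((src, dst))
    return dict(buckets)


# Toy graph: 4 nodes, 5 directed edges, split into 2 partitions.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
node_partition = partition_nodes(num_nodes=4, num_partitions=2)
buckets = bucket_edges(edges, node_partition)

for (src_part, dst_part), bucket in sorted(buckets.items()):
    print(f"bucket ({src_part}, {dst_part}): {bucket}")
```

Under this scheme, training on one bucket touches only the embeddings of the two partitions named in its key, so the full embedding table would not need to sit in memory at once.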
Findings
Achieves performance comparable to existing embedding systems on common benchmarks.
Successfully embeds several large social network graphs and the Freebase dataset.
Supports training on single or multiple machines.
Abstract
Graph embedding methods produce unsupervised node features from graphs that can then be used for a variety of machine learning tasks. Modern graphs, particularly in industrial applications, contain billions of nodes and trillions of edges, which exceeds the capability of existing embedding systems. We present PyTorch-BigGraph (PBG), an embedding system that incorporates several modifications to traditional multi-relation embedding systems that allow it to scale to graphs with billions of nodes and trillions of edges. PBG uses graph partitioning to train arbitrarily large embeddings on either a single machine or in a distributed environment. We demonstrate comparable performance with existing embedding systems on common benchmarks, while allowing for scaling to arbitrarily large graphs and parallelization on multiple machines. We train and evaluate embeddings on several large social…
Taxonomy
Topics: Advanced Graph Neural Networks · Caching and Content Delivery · Complex Network Analysis Techniques
