Grale: Designing Networks for Graph Learning
Jonathan Halcrow, Alexandru Moşoi, Sam Ruth, Bryan Perozzi

TL;DR
Grale is a scalable graph design method that fuses multiple similarity measures to create task-specific graphs for large-scale semi-supervised learning, significantly improving performance in industrial applications.
Contribution
We introduce Grale, a novel scalable approach for designing task-specific graphs from multiple similarity measures on datasets with billions of nodes.
Findings
Grale has been successfully deployed on datasets with billions of nodes.
Increases recall by 89% in abuse detection on YouTube.
Reduces graph construction time from weeks to hours.
Abstract
How can we find the right graph for semi-supervised learning? In real-world applications, the choice of which edges to use for computation is the first step in any graph learning process. Interestingly, there are often many types of similarity available to choose as the edges between nodes, and the choice of edges can drastically affect the performance of downstream semi-supervised learning systems. However, despite the importance of graph design, most of the literature assumes that the graph is static. In this work, we present Grale, a scalable method we have developed to address the problem of graph design for graphs with billions of nodes. Grale operates by fusing together different measures of (potentially weak) similarity to create a graph which exhibits high task-specific homophily between its nodes. Grale is designed for running on large datasets. We have deployed Grale in more…
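To make the idea of fusing several weak similarity measures into one task-specific graph concrete, here is a minimal toy sketch. It is not the authors' implementation: the choice of a logistic-regression pair model, the two similarity measures (Jaccard and cosine), the node fields `tokens` and `vec`, the 0.5 threshold, and every function name are assumptions made purely for illustration. It also enumerates all node pairs, which would not scale to the graph sizes the abstract describes.

```python
import math
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def pair_features(nodes, i, j):
    """One feature per (hypothetical) similarity measure for the pair (i, j)."""
    return [
        jaccard(nodes[i]["tokens"], nodes[j]["tokens"]),
        cosine(nodes[i]["vec"], nodes[j]["vec"]),
    ]


def predict(w, b, x):
    """Fused similarity: sigmoid of a weighted sum of the individual measures."""
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Full-batch gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        errors = [predict(w, b, x) - t for x, t in zip(X, y)]
        for k in range(len(w)):
            w[k] -= lr * sum(e * x[k] for e, x in zip(errors, X)) / n
        b -= lr * sum(errors) / n
    return w, b


def build_graph(nodes, labels, threshold=0.5):
    """Learn a pair model from labeled nodes, then score every pair of nodes."""
    train_pairs = list(combinations(sorted(labels), 2))
    X = [pair_features(nodes, i, j) for i, j in train_pairs]
    # Target is 1 when both endpoints share a label (task-specific homophily).
    y = [1.0 if labels[i] == labels[j] else 0.0 for i, j in train_pairs]
    w, b = fit_logistic(X, y)
    edges = {}
    for i, j in combinations(range(len(nodes)), 2):
        weight = predict(w, b, pair_features(nodes, i, j))
        if weight >= threshold:
            edges[(i, j)] = weight
    return edges


def edge_homophily(edges, labels):
    """Fraction of edges between labeled nodes whose endpoints share a label."""
    both = [(i, j) for i, j in edges if i in labels and j in labels]
    if not both:
        return 0.0
    return sum(labels[i] == labels[j] for i, j in both) / len(both)


# Toy data: nodes 0-3 are labeled, nodes 4 and 5 are unlabeled.
nodes = [
    {"tokens": {"a", "b", "c"}, "vec": [1.0, 0.0]},
    {"tokens": {"a", "b", "d"}, "vec": [0.9, 0.1]},
    {"tokens": {"x", "y", "z"}, "vec": [0.0, 1.0]},
    {"tokens": {"x", "y", "w"}, "vec": [0.1, 0.9]},
    {"tokens": {"a", "b", "c", "d"}, "vec": [1.0, 0.1]},
    {"tokens": {"x", "z", "w"}, "vec": [0.1, 1.0]},
]
labels = {0: "spam", 1: "spam", 2: "ham", 3: "ham"}

edges = build_graph(nodes, labels)
homophily = edge_homophily(edges, labels)
```

In this sketch the learned edge weight is the model's estimate that two nodes share a label, so thresholding it keeps edges that are likely to be homophilous for the task, and the unlabeled nodes are connected using the same fused score. A different labeling of the same nodes would yield different weights and hence a different graph, which is the sense in which the graph is task-specific.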
