Distributed-Memory Vertex-Centric Network Embedding for Large-Scale Graphs
Sara Riazi, Boyana Norris

TL;DR
This paper introduces a scalable distributed-memory network embedding method using Apache Spark and GraphX, capable of handling large-scale graphs for tasks like vertex classification and link prediction.
Contribution
The paper presents a distributed-memory parallel approach to network embedding, built on Apache Spark and GraphX, that scales to very large graphs. Existing approaches, by contrast, are limited to graphs with fewer than a million edges.
Findings
Successfully scales to large graphs with over a million edges
Produces meaningful embeddings for classification and prediction tasks
Demonstrates effectiveness on real-world and synthetic data
Abstract
Network embedding is an important step in many computations based on graph data. However, existing approaches are limited to small or medium-sized graphs with fewer than a million edges. In practice, web and social network graphs are orders of magnitude larger, making most current methods impractical for them. To address this problem, we introduce a new distributed-memory parallel network embedding method based on Apache Spark and GraphX. We demonstrate the scalability of our method as well as its ability to generate meaningful embeddings for vertex classification and link prediction on both real-world and synthetic graphs.
Taxonomy
Topics: Advanced Graph Neural Networks · Complex Network Analysis Techniques · Graph Theory and Algorithms
