Towards Lightweight and Automated Representation Learning System for Networks

Yuyang Xie, Jiezhong Qiu, Laxman Dhulipala, Wenjian Yu, Jie Tang, Richard Peng, and Chi Wang

arXiv:2302.07084·cs.SI·February 15, 2023

TL;DR

LIGHTNE 2.0 is a scalable, CPU-only network embedding system that achieves high quality and efficiency on massive graphs, outperforming distributed and GPU-based methods in speed and cost.

Contribution

The paper introduces LIGHTNE 2.0, a network embedding system that combines two theoretically grounded methods, NetSMF and ProNE, with new techniques to produce large-scale, high-quality embeddings on a single CPU-only machine.

Findings

1. Up to 84x faster than GraphVite
2. Embeds graphs with 124 billion edges in half an hour
3. Outperforms existing methods in speed and quality

Abstract

We propose LIGHTNE 2.0, a cost-effective, scalable, automated, and high-quality network embedding system that scales to graphs with hundreds of billions of edges on a single machine. In contrast to the mainstream belief that a distributed architecture and GPUs are needed for large-scale network embedding with good quality, we prove that we can achieve higher quality, better scalability, lower cost, and faster runtime with a shared-memory, CPU-only architecture. LIGHTNE 2.0 combines two theoretically grounded embedding methods, NetSMF and ProNE. We introduce the following techniques to network embedding for the first time: (1) a newly proposed downsampling method to reduce the sample complexity of NetSMF while preserving its theoretical advantages; (2) a high-performance parallel graph processing stack, GBBS, to achieve high memory efficiency and scalability; (3) sparse parallel hash table to…
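The general idea of downsampling edge samples and aggregating them in a hash table can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not specify the downsampling rule, so the sketch assumes a uniform keep probability, uses a plain Python dictionary as a single-threaded stand-in for a parallel hash table, and the function name `downsample_and_aggregate` is hypothetical.

```python
import random
from collections import defaultdict


def downsample_and_aggregate(samples, keep_prob, seed=0):
    """Keep each sampled edge with probability keep_prob and sum weights per edge.

    Assumption: uniform keep probability (illustrative only, not the
    paper's downsampling rule). Survivors are reweighted by 1/keep_prob
    so the expected total weight of each edge is unchanged.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    rng = random.Random(seed)
    # A dict keyed by (u, v) stands in for a sparse hash table of edge weights.
    weights = defaultdict(float)
    for u, v in samples:
        if rng.random() < keep_prob:
            weights[(u, v)] += 1.0 / keep_prob
    return dict(weights)


# Hypothetical toy input: repeated (u, v) pairs model repeated samples of an edge.
samples = [(0, 1), (0, 1), (1, 2), (2, 0)]
print(downsample_and_aggregate(samples, keep_prob=1.0))
```

Reweighting each surviving sample by `1 / keep_prob` keeps the aggregated weights unbiased in expectation while storing fewer entries, which is the trade-off any such downsampling scheme has to balance against variance.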
