Distributed Training of Embeddings using Graph Analytics
Gurbinder Gill (1), Roshan Dathathri (1), Saeed Maleki (2), Madan Musuvathi (2), Todd Mytkowicz (2), Olli Saarikivi (2) ((1) The University of Texas at Austin, (2) Microsoft Research)

TL;DR
This paper introduces GraphAny2Vec, a distributed framework for training embeddings like Word2Vec and Node2Vec using graph analytics, achieving high accuracy and significant speedups over existing methods.
Contribution
It formulates embedding training as a graph problem, adapts a distributed graph framework, and introduces a novel gradient combination method for improved accuracy and efficiency.
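As a rough illustration of what "embedding training as a graph problem" can look like, the sketch below turns a token stream into a graph whose vertices are vocabulary items and whose edges are (target, context) pairs inside a sliding window. This is an assumption made for illustration, not the paper's actual construction: the function name `cooccurrence_edges`, the `window` parameter, and the toy sentence are all hypothetical.

```python
from collections import Counter


def cooccurrence_edges(tokens, window=2):
    """Count directed (target, context) pairs within `window` positions.

    Hypothetical helper: each distinct pair is treated as a graph edge and
    its count as the edge weight.
    """
    edges = Counter()
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a token is not its own context
                edges[(target, tokens[j])] += 1
    return edges


# Toy corpus, assumed for the example only.
tokens = "the quick brown fox jumps".split()
edges = cooccurrence_edges(tokens, window=1)
vertices = sorted(set(tokens))
```

Under this view, a Skip-gram-style update for a (target, context) pair touches only the embeddings of the two endpoint vertices, which is the kind of per-edge operation a graph analytics framework is built to schedule.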
Findings
Matches state-of-the-art accuracy on 32-host clusters
Achieves 12x and 5x speedup over existing methods
Improves DMTK accuracy by over 30% with Gradient Combiner
Abstract
Many applications today, such as NLP, network analysis, and code analysis, rely on semantically embedding objects into low-dimensional fixed-length vectors. Such embeddings naturally provide a way to perform useful downstream tasks, such as identifying relations among objects or predicting objects for a given context. Unfortunately, the training necessary for accurate embeddings is usually computationally intensive and requires processing large amounts of data. Furthermore, distributing this training is challenging. Most embedding training uses stochastic gradient descent (SGD), an "inherently" sequential algorithm in which each update depends on the parameters produced by the previous one. Prior approaches to parallelizing SGD do not honor these dependencies and thus potentially suffer poor convergence. This paper presents a distributed training framework for a class of applications that use Skip-gram-like models to generate embeddings. We call this…
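The sequential dependency in SGD can be seen in a toy one-parameter problem. The sketch below is an assumption made for illustration and is not the paper's Gradient Combiner: it compares processing examples one after another with a naive scheme in which several workers all read the same stale parameter and their deltas are summed. The loss 0.5 * (w - x)^2, the learning rate, and all function names are hypothetical.

```python
def sgd_step(w, x, lr):
    """One SGD step on the loss 0.5 * (w - x)**2, whose gradient is (w - x)."""
    return w - lr * (w - x)


def sequential(w, examples, lr):
    """Each step starts from the parameter produced by the previous step."""
    for x in examples:
        w = sgd_step(w, x, lr)
    return w


def stale_parallel(w, examples, lr):
    """Every simulated worker reads the same starting w; deltas are summed."""
    deltas = [sgd_step(w, x, lr) - w for x in examples]
    return w + sum(deltas)


w0, lr = 0.0, 0.5
examples = [4.0, 4.0, 4.0, 4.0]  # the optimum is w = 4.0

seq = sequential(w0, examples, lr)      # 0 -> 2 -> 3 -> 3.5 -> 3.75
par = stale_parallel(w0, examples, lr)  # 0 + 2 + 2 + 2 + 2 = 8.0
```

The sequential run approaches the optimum at 4.0, while the stale-parameter run overshoots it, because each worker's gradient ignores the progress made by the others. This is the kind of convergence problem the abstract attributes to approaches that do not honor the dependencies.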
Taxonomy
Methods: DeepWalk · node2vec · Stochastic Gradient Descent
