Distributed non-negative RESCAL with Automatic Model Selection for Exascale Data
Manish Bhattarai, Namita Kharat, Erik Skau, Benjamin Nebgen, Hristo Djidjev, Sanjay Rajopadhye, James P. Smith, Boian Alexandrov

TL;DR
This paper introduces a distributed non-negative RESCAL algorithm with automatic model selection, enabling efficient analysis of extremely large relational datasets on heterogeneous architectures, demonstrated on real-world and synthetic data.
Contribution
The paper presents pyDRESCALk, a novel distributed non-negative RESCAL method with automatic model selection for large-scale knowledge graph embedding.
Findings
Near-linear scaling on large datasets
Successfully analyzed 11-terabyte dense tensor
Determined latent communities in 9-exabyte sparse tensor
Abstract
With the boom in the development of computer hardware and software, social media, IoT platforms, and communications, there has been an exponential growth in the volume of data produced around the world. Among these data, relational datasets are growing in popularity as they provide unique insights regarding the evolution of communities and their interactions. Relational datasets are naturally non-negative, sparse, and extra-large. Relational data usually contain triples, (subject, relation, object), and are represented as graphs/multigraphs, called knowledge graphs, which need to be embedded into a low-dimensional dense vector space. Among various embedding models, RESCAL allows learning of relational data to extract the posterior distributions over the latent variables and to make predictions of missing relations. However, RESCAL is computationally demanding and requires a fast and…
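To make the model in the abstract concrete, here is a minimal single-node sketch of non-negative RESCAL. It assumes the usual formulation in which each relation k gives an adjacency slice X_k that is approximated as A R_k A^T with non-negative A and R_k, fitted by standard multiplicative updates. It is not the pyDRESCALk implementation: it is neither distributed nor does it perform automatic model selection, and the names triples_to_tensor, nn_rescal, and relative_error are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def triples_to_tensor(triples, n_entities, n_relations):
    """Build a dense tensor with X[k, i, j] = 1 for each (subject i, relation k, object j)."""
    X = np.zeros((n_relations, n_entities, n_entities))
    for subj, rel, obj in triples:
        X[rel, subj, obj] = 1.0
    return X


def nn_rescal(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Fit X[k] ~ A @ R[k] @ A.T with A >= 0 and R[k] >= 0 via multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, n, _ = X.shape
    A = rng.random((n, rank))        # shared entity embeddings
    R = rng.random((m, rank, rank))  # one interaction matrix per relation
    for _ in range(n_iter):
        AtA = A.T @ A
        # Update A: ratio of the negative and positive parts of the gradient.
        num = np.zeros_like(A)
        den = np.zeros((rank, rank))
        for k in range(m):
            num += X[k] @ A @ R[k].T + X[k].T @ A @ R[k]
            den += R[k] @ AtA @ R[k].T + R[k].T @ AtA @ R[k]
        A *= num / (A @ den + eps)
        # Update each R[k] using the refreshed A.
        AtA = A.T @ A
        for k in range(m):
            R[k] *= (A.T @ X[k] @ A) / (AtA @ R[k] @ AtA + eps)
    return A, R


def relative_error(X, A, R):
    """Relative Frobenius reconstruction error over all relation slices."""
    X_hat = np.einsum("ir,krs,js->kij", A, R, A)
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)
```

For example, triples_to_tensor([(0, 1, 2)], n_entities=3, n_relations=2) marks entity 0 as related to entity 2 under relation 1, and nn_rescal(X, rank=3) returns the factors for a chosen rank. Here the rank is supplied by hand, whereas the paper describes selecting it automatically.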
Taxonomy
Topics: Advanced Graph Neural Networks · Caching and Content Delivery · Stochastic Gradient Optimization Techniques
