Relationship Queries on Large graphs using Pregel
Puneet Agarwal, Maya Ramanath, Gautam Shroff

TL;DR
This paper introduces a distributed algorithm for relationship queries on large graphs using Pregel, enabling scalable and efficient analysis of massive graph-structured data like linked-open-data.
Contribution
The paper presents a novel distributed keyword search algorithm for large graphs based on Pregel, with proofs of optimality and practical implementation details.
Findings
Efficient processing of relationship queries on graphs with billions of nodes.
Algorithm produces optimal ranked answers when run to completion.
Experimental results demonstrate scalability and efficiency on large-scale linked-open-data.
Abstract
Large-scale graph-structured data arising from social networks, databases, knowledge bases, web graphs, etc. is now available for analysis and mining. Graph mining often involves 'relationship queries', which seek a ranked list of interesting interconnections among a given set of entities, corresponding to nodes in the graph. While relationship queries have been studied for many years under various names, e.g., keyword search and Steiner trees in a graph, the solutions proposed in the literature so far have not focused on scaling relationship queries to large graphs having billions of nodes and edges, such as are now publicly available in the form of 'linked-open-data'. In this paper, we present an algorithm for distributed keyword search (DKS) on large graphs, based on the graph-parallel computing paradigm Pregel. We also present an analytical proof that our algorithm produces…
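To make the vertex-centric idea in the abstract concrete, here is a minimal single-process sketch of Pregel-style supersteps applied to a toy keyword query. It is not the paper's DKS algorithm: the function names (pregel_keyword_distances, rank_roots), the unit edge weights, and the sum-of-hop-distances score are all assumptions chosen for illustration. Each keyword node floods its hop distance to its neighbours one superstep at a time, and nodes reached by every keyword are then ranked as candidate connection points.

```python
from collections import defaultdict


def pregel_keyword_distances(adj, keyword_nodes):
    """Simulate vertex-centric supersteps on one machine (illustrative only).

    adj: dict mapping node -> list of neighbours (undirected, unit weight).
    keyword_nodes: dict mapping keyword -> the node that matches it.
    Returns dict mapping node -> {keyword: hop distance}.
    """
    dist = defaultdict(dict)

    # Superstep 0: every keyword node receives its own keyword at distance 0.
    inbox = defaultdict(list)
    for kw, node in keyword_nodes.items():
        inbox[node].append((kw, 0))

    # Run supersteps until no messages are in flight (all vertices halt).
    while inbox:
        outbox = defaultdict(list)
        for node, messages in inbox.items():
            for kw, d in messages:
                # A vertex only reacts to messages that improve its state.
                if d < dist[node].get(kw, float("inf")):
                    dist[node][kw] = d
                    for nb in adj.get(node, []):
                        outbox[nb].append((kw, d + 1))
        inbox = outbox

    return dict(dist)


def rank_roots(dist, keywords):
    """Rank nodes reached by all keywords by total hop distance (assumed score)."""
    roots = [
        (sum(d[k] for k in keywords), node)
        for node, d in dist.items()
        if all(k in d for k in keywords)
    ]
    return sorted(roots)


# Hypothetical star graph: hub "h" links three entities "a", "b", "c".
adj = {"h": ["a", "b", "c"], "a": ["h"], "b": ["h"], "c": ["h"]}
keyword_nodes = {"x": "a", "y": "b", "z": "c"}

dist = pregel_keyword_distances(adj, keyword_nodes)
ranking = rank_roots(dist, ["x", "y", "z"])
print(ranking[0])  # best candidate connection point under the assumed score
```

In a real Pregel deployment the inner loop over vertices would run in parallel across workers, with message delivery handled by the framework between supersteps. The sketch keeps only the message-passing structure.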
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Graph Theory and Algorithms
