Answer Graph: Factorization Matters in Large Graphs

Zahid Abul-Basher, Nikolay Yakovets, Parke Godfrey, Stanley Clark, and Mark Chignell

arXiv:2011.04838·cs.DB·November 11, 2020

Open Access

TL;DR

This paper introduces the answer-graph method for evaluating SPARQL conjunctive queries, which uses factorization to reduce evaluation costs and enables a cost-based planner, demonstrated through a prototype system and benchmark comparisons.

Contribution

The paper presents a novel answer-graph approach that factorizes answer sets to improve query evaluation efficiency and supports cost-based query planning.

Findings

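01. The answer-graph method can greatly reduce the cost of evaluating SPARQL conjunctive queries.

02. The prototype system, Wireframe, implements the answer-graph approach.

03. A micro-benchmark over YAGO2s with snowflake and diamond queries illustrates performance advantages over PostgreSQL, Virtuoso, MonetDB, and Neo4j.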

Abstract

Our answer-graph method to evaluate SPARQL conjunctive queries (CQs) first finds a factorized answer set, an answer graph, and then finds the embedding tuples from this. This approach can greatly reduce the cost of evaluating CQs. It also affords a second advantage: we can construct a cost-based planner. We present the answer-graph approach and give an overview of our prototype system, Wireframe. We then offer proof of concept via a micro-benchmark over the YAGO2s dataset with two prevalent shapes of queries, snowflake and diamond. We compare Wireframe's performance on these against PostgreSQL, Virtuoso, MonetDB, and Neo4j to illustrate the performance advantages of our answer-graph approach.
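
To make the two-phase idea in the abstract concrete, here is a minimal Python sketch. It is not Wireframe's algorithm: it assumes a toy edge-labeled graph, handles only a simple chain query (a sequence of edge labels), and uses forward and backward semi-join passes as a stand-in for answer-graph construction. All names (GRAPH, answer_graph_chain, embeddings) and the data are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: edge label -> set of (subject, object) pairs.
GRAPH = {
    "A": {(f"a{i}", "hub") for i in range(3)} | {("a9", "dead_end")},
    "B": {("hub", f"b{j}") for j in range(3)} | {("orphan", "b9")},
}

def answer_graph_chain(graph, labels):
    """Phase 1 (sketch): for each query edge, keep only the data edges that
    lie on some full path matching the label sequence."""
    kept = [set(graph[label]) for label in labels]
    # Forward pass: an edge survives only if its source was reached
    # by a surviving edge of the previous query edge.
    for i in range(1, len(kept)):
        reachable = {o for (_, o) in kept[i - 1]}
        kept[i] = {(s, o) for (s, o) in kept[i] if s in reachable}
    # Backward pass: an edge survives only if its target continues
    # into a surviving edge of the next query edge.
    for i in range(len(kept) - 2, -1, -1):
        needed = {s for (s, _) in kept[i + 1]}
        kept[i] = {(s, o) for (s, o) in kept[i] if o in needed}
    return kept

def embeddings(kept):
    """Phase 2 (sketch): enumerate the embedding tuples from the answer graph."""
    paths = [(s, o) for (s, o) in kept[0]]
    for edges in kept[1:]:
        by_src = defaultdict(list)
        for s, o in edges:
            by_src[s].append(o)
        paths = [p + (o,) for p in paths for o in by_src[p[-1]]]
    return paths

ag = answer_graph_chain(GRAPH, ["A", "B"])
tuples = embeddings(ag)
ag_size = sum(len(edges) for edges in ag)
print(f"answer-graph edges: {ag_size}, embedding tuples: {len(tuples)}")
```

In this toy instance the answer graph holds 6 edges while the query has 9 embedding tuples, and the two edges that cannot take part in any embedding are pruned before enumeration. The gap between the factorized form and the flat tuple set widens as fan-in and fan-out grow, which is the intuition behind the cost reduction the abstract describes.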

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Topic Modeling