Mining tree-query associations in graphs
Eveline Hoekx, Jan Van den Bussche

TL;DR
This paper introduces a new class of tree-shaped patterns called tree queries for data mining in large graph datasets, with algorithms that efficiently mine these patterns and their associations, demonstrated on biological and social data.
Contribution
It presents a novel class of tree queries that include constants and existential nodes, along with algorithms that have provable optimality properties for mining these patterns.
Findings
The algorithms work in practice on food-web, protein-interaction, and citation-analysis data.
The approach is implemented in SQL and shows practical efficiency.
Tree queries can include constants and existential nodes for flexible pattern matching.
Abstract
New applications of data mining, such as in biology, bioinformatics, or sociology, are faced with large datasets structured as graphs. We introduce a novel class of tree-shaped patterns called tree queries, and present algorithms for mining tree queries and tree-query associations in a large data graph. Novel about our class of patterns is that they can contain constants, and can contain existential nodes which are not counted when determining the number of occurrences of the pattern in the data graph. Our algorithms have a number of provable optimality properties, which are based on the theory of conjunctive database queries. We propose a practical, database-oriented implementation in SQL, and show that the approach works in practice through experiments on data about food webs, protein interactions, and citation analysis.
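To make the counting rule in the abstract concrete, here is a minimal sketch of how occurrences of a small tree-shaped pattern might be counted with SQL. It is an illustration under stated assumptions, not the authors' implementation: the function `count_occurrences`, the edge table `G(src, dst)`, and the toy food web are all hypothetical names chosen for this example, and the sketch covers only frequency counting, not the mining algorithms themselves. The pattern is evaluated as a conjunctive query (one copy of the edge table per pattern edge), and only the non-existential nodes appear in the `SELECT DISTINCT` list, so existential nodes must match but do not multiply the count.

```python
import sqlite3

def count_occurrences(edges, pattern, distinguished, constants=None):
    """Count distinct bindings of the `distinguished` pattern nodes.

    edges         -- data graph as (src, dst) pairs
    pattern       -- pattern edges as (node, node) pairs over node names
    distinguished -- pattern nodes that are counted (must be non-empty)
    constants     -- optional {pattern node: fixed data value}
    Pattern nodes that are neither distinguished nor constant act as
    existential nodes: they must be matched but are not counted.
    (Hypothetical helper written for this illustration.)
    """
    constants = constants or {}
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE G (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO G VALUES (?, ?)", edges)

    first_col = {}            # pattern node -> SQL column of its first use
    conditions, params = [], []
    for i, (u, v) in enumerate(pattern):
        for node, col in ((u, f"e{i}.src"), (v, f"e{i}.dst")):
            if node in first_col:
                # Repeated node: force both columns to the same data value.
                conditions.append(f"{col} = {first_col[node]}")
            else:
                first_col[node] = col
                if node in constants:
                    conditions.append(f"{col} = ?")
                    params.append(constants[node])

    tables = ", ".join(f"G AS e{i}" for i in range(len(pattern)))
    columns = ", ".join(
        f"{first_col[n]} AS c{k}" for k, n in enumerate(distinguished)
    )
    where = " AND ".join(conditions) or "1 = 1"
    sql = (
        f"SELECT COUNT(*) FROM "
        f"(SELECT DISTINCT {columns} FROM {tables} WHERE {where})"
    )
    count = conn.execute(sql, params).fetchone()[0]
    conn.close()
    return count

# Toy food web (made-up data): an edge (a, b) means "a eats b".
FOOD_WEB = [
    ("fox", "rabbit"), ("fox", "mouse"), ("owl", "mouse"),
    ("rabbit", "grass"), ("mouse", "grass"), ("mouse", "seeds"),
]
PATH = [("x", "y"), ("y", "z")]  # pattern: x eats y, y eats z

print(count_occurrences(FOOD_WEB, PATH, ["x", "y", "z"]))          # all nodes counted
print(count_occurrences(FOOD_WEB, PATH, ["x"]))                    # y, z existential
print(count_occurrences(FOOD_WEB, PATH, ["x", "y"], {"z": "grass"}))  # z is a constant
```

On this toy graph the three calls give 5, 2, and 3. Making `y` and `z` existential collapses the five full matches into the two distinct top-level predators, and fixing `z` to the constant `grass` restricts the matches to chains that end in grass.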
Taxonomy
Topics: Data Mining Algorithms and Applications · Data Management and Algorithms · Advanced Database Systems and Queries
