EmbAssi: Embedding Assignment Costs for Similarity Search in Large Graph Databases
Franka Bause, Erich Schubert, Nils M. Kriege

TL;DR
EmbAssi introduces novel embedding-based lower bounds for graph edit distance, enabling efficient similarity search in large graph databases by reducing computational costs through effective filtering.
Contribution
The paper presents a new approach to embed assignment costs into $\ell_1$ space using tree metrics, significantly improving filtering efficiency for large graph databases.
Findings
Lower bounds are close to exact graph edit distances in real-world graphs.
Index construction and search scale to databases with millions of graphs.
Embedding costs into $\ell_1$ space enables fast filtering in large datasets.
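The connection between tree metrics and $\ell_1$ space can be illustrated with a small sketch. It relies on a general property of tree metrics, not on the paper's specific construction: for two equal-sized multisets of tree nodes, the optimal assignment cost equals the $\ell_1$ distance between vectors whose entries are edge weight times the number of elements below that edge. The tree, the names `embed` and `l1_distance`, and the example multisets are all hypothetical, and unequal multiset sizes (which would require padding) are not handled.

```python
from collections import Counter

def embed(elements, parent, weight):
    """Map a multiset of tree nodes to a sparse weighted count vector.

    parent[v] is the parent of v (the root maps to None); weight[v] is the
    weight of the edge between v and parent[v]. Entry v of the result is
    weight[v] times the number of elements located in the subtree of v.
    """
    counts = Counter()
    for v in elements:
        # Walk up to the root, counting the element once per edge on the path.
        while parent[v] is not None:
            counts[v] += 1
            v = parent[v]
    return {v: weight[v] * c for v, c in counts.items()}

def l1_distance(x, y):
    """Manhattan distance between two sparse vectors stored as dicts."""
    return sum(abs(x.get(k, 0) - y.get(k, 0)) for k in x.keys() | y.keys())

# Hypothetical toy tree: r is the root, a and b are its children,
# a1 and a2 are children of a.
parent = {"r": None, "a": "r", "b": "r", "a1": "a", "a2": "a"}
weight = {"a": 1, "b": 1, "a1": 2, "a2": 2}

A = ["a1", "a1", "b"]
B = ["a1", "a2", "b"]
cost = l1_distance(embed(A, parent, weight), embed(B, parent, weight))
```

In this toy case the only mismatch is one `a1` against one `a2`, which are joined by a path of weight 2 + 2, so `cost` is 4, the same as the optimal assignment cost. Because the vectors live in $\ell_1$ space, they can be handed to a standard metric index instead of solving an assignment problem per database graph.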
Abstract
The graph edit distance is an intuitive measure to quantify the dissimilarity of graphs, but its computation is NP-hard and challenging in practice. We introduce methods for answering nearest neighbor and range queries regarding this distance efficiently for large databases with up to millions of graphs. We build on the filter-verification paradigm, where lower and upper bounds are used to reduce the number of exact computations of the graph edit distance. Highly effective bounds for this involve solving a linear assignment problem for each graph in the database, which is prohibitive in massive datasets. Index-based approaches typically provide only weak bounds, leading to high computational costs for verification. In this work, we derive novel lower bounds for efficient filtering from restricted assignment problems, where the cost function is a tree metric. This special case allows…
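The filter-verification paradigm mentioned in the abstract can be sketched for a range query as follows. This is a generic illustration under stated assumptions, not the paper's implementation: `range_query` and its parameters are hypothetical names, the bounds are passed in as plain callables, and integers with absolute difference stand in for graphs and the graph edit distance.

```python
def range_query(query, database, radius, lower_bound, upper_bound, exact_distance):
    """Return all items within `radius` of `query`, calling the expensive
    exact distance only when the cheap bounds cannot decide."""
    results = []
    for g in database:
        if lower_bound(query, g) > radius:
            continue  # filtered out: the true distance must exceed the radius
        if upper_bound(query, g) <= radius:
            results.append(g)  # accepted: the true distance cannot exceed the radius
            continue
        if exact_distance(query, g) <= radius:  # verification step
            results.append(g)
    return results

# Toy stand-ins (assumptions): integers instead of graphs, |a - b| instead of
# the graph edit distance, and deliberately loose bounds around it.
hits = range_query(
    5, [1, 4, 5, 7, 12], 2,
    lower_bound=lambda a, b: abs(a - b) // 2,
    upper_bound=lambda a, b: 2 * abs(a - b),
    exact_distance=lambda a, b: abs(a - b),
)
```

With these toy bounds, 12 is discarded by the lower bound, 4 and 5 are accepted by the upper bound, and only 1 and 7 reach the exact computation. Tighter lower bounds shrink that last group, which is where the cost of an NP-hard distance is concentrated.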
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Graph Neural Networks · Advanced Database Systems and Queries
