# FLUID: A Common Model for Semantic Structural Graph Summaries Based on Equivalence Relations

**Authors:** Till Blume, David Richerby, Ansgar Scherp

arXiv: 1908.01528 · 2021-01-05

## TL;DR

This paper introduces FLUID, a formal, flexible model for semantic structural graph summaries that unifies various existing approaches and enables efficient computation and comparison across large-scale graphs.

## Contribution

The paper presents FLUID, a novel formal model for structural graph summaries based on equivalence relations, allowing quick definition, adaptation, and comparison of summaries.

## Key findings

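- Graph summaries defined with FLUID can be computed in worst-case O(n^2) time, where n is the number of edges in the data graph.
- An empirical analysis of large-scale web graphs with billions of edges indicates a typical running time that is linear in n.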
- The model unifies and extends existing graph summarization concepts.

## Abstract

Summarization is a widespread method for handling very large graphs. The task of structural graph summarization is to compute a concise but meaningful synopsis of the key structural information of a graph. As summaries may be used for many different purposes, there is no single concept or model of graph summaries. We have studied existing structural graph summaries for large-scale (semantic) graphs. Despite their different concepts and purposes, we found commonalities in the graph structures they capture. We use these commonalities to provide for the first time a formally defined common model, FLUID (FLexible graph sUmmarIes for Data graphs), that allows us to flexibly define structural graph summaries. FLUID allows graph summaries to be quickly defined, adapted, and compared for different purposes and datasets. To this end, FLUID provides features of structural summarization based on equivalence relations such as distinction of types and properties, direction of edges, bisimulation, and inference. We conduct a detailed complexity analysis of the features provided by FLUID. We show that graph summaries defined with FLUID can be computed in the worst case in time $\mathcal{O}(n^2)$ w.r.t. $n$, the number of edges in the data graph. An empirical analysis of large-scale web graphs with billions of edges indicates a typical running time of $\Theta(n)$. Based on the formal FLUID model, one can quickly define and modify various structural graph summaries from the literature and beyond.
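To make the idea of a summary based on an equivalence relation concrete, here is a minimal sketch. It is not the FLUID algorithm from the paper. It assumes the data graph is given as (subject, property, object) triples and uses one simple, illustrative equivalence relation: two subjects are equivalent when they have the same set of outgoing property labels. The function name `summarize_by_property_set` and the sample triples are hypothetical.

```python
from collections import defaultdict


def summarize_by_property_set(edges):
    """Partition subjects by their set of outgoing property labels.

    `edges` is an iterable of (subject, property, object) triples.
    Two subjects land in the same equivalence class iff their sets of
    outgoing properties are equal. This is an illustrative relation,
    not a definition taken from the paper.
    """
    # Single pass over the edges: collect the property set of each subject.
    props = defaultdict(set)
    for subject, prop, _obj in edges:
        props[subject].add(prop)

    # Group subjects whose property sets coincide.
    classes = defaultdict(set)
    for subject, prop_set in props.items():
        classes[frozenset(prop_set)].add(subject)
    return dict(classes)


# Hypothetical toy data graph.
edges = [
    ("alice", "name", "Alice"),
    ("alice", "knows", "bob"),
    ("bob", "name", "Bob"),
    ("bob", "knows", "alice"),
    ("paper1", "title", "FLUID"),
    ("paper1", "author", "alice"),
]

summary = summarize_by_property_set(edges)
for prop_set, members in summary.items():
    print(sorted(prop_set), "->", sorted(members))
```

In this sketch each equivalence class plays the role of one summary vertex. Vertices that appear only as objects are not classified, which a fuller treatment would need to handle. Swapping in a different relation (for example, one that also considers edge direction or types) changes only the key used for grouping.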

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.01528/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1908.01528/full.md

## References

45 references — full list in the complete paper: https://tomesphere.com/paper/1908.01528/full.md

---
Source: https://tomesphere.com/paper/1908.01528