SANGEA: Scalable and Attributed Network Generation
Valentin Lemaire, Youssef Achenchabe, Lucas Ody, Houssem Eddine Souid, Gianmarco Aversano, Nicolas Posocco, Sabri Skhiri

TL;DR
SANGEA is a scalable framework for generating large synthetic attributed graphs by dividing the graph into communities, training on each, and then linking them, maintaining high utility and privacy.
Contribution
It introduces a novel community-based approach to scale synthetic graph generation, enabling application to large graphs while preserving utility and privacy.
Findings
Generated graphs closely match original topology and features.
High utility in downstream tasks like link prediction.
Reasonable privacy scores achieved.
Abstract
The topic of synthetic graph generators (SGGs) has recently received much attention due to the wave of the latest breakthroughs in generative modelling. However, many state-of-the-art SGGs do not scale well with the graph size. Indeed, in the generation process, all the possible edges for a fixed number of nodes must often be considered, which scales in $O(N^2)$, with $N$ being the number of nodes in the graph. For this reason, many state-of-the-art SGGs are not applicable to large graphs. In this paper, we present SANGEA, a sizeable synthetic graph generation framework which extends the applicability of any SGG to large graphs. By first splitting the large graph into communities, SANGEA trains one SGG per community, then links the community graphs back together to create a synthetic large graph. Our experiments show that the graphs generated by SANGEA have high similarity to…
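The split, generate, and link idea described in the abstract can be illustrated with a toy sketch. This is not the SANGEA implementation: the per-community "SGG" here is a stand-in that simply resamples edges at the observed density, the linking step is a uniform resampling of cross-community pairs, and all function names (`candidate_pairs`, `split_by_community`, `density_generator`, `generate`) are hypothetical. It assumes an undirected graph with integer node ids and a precomputed community label per node.

```python
import random
from collections import defaultdict
from itertools import combinations


def candidate_pairs(n):
    """Number of unordered node pairs among n nodes (grows quadratically)."""
    return n * (n - 1) // 2


def split_by_community(edges, community_of):
    """Separate edges into per-community lists and a list of cross-community edges."""
    intra = defaultdict(list)
    inter = []
    for u, v in edges:
        if community_of[u] == community_of[v]:
            intra[community_of[u]].append((u, v))
        else:
            inter.append((u, v))
    return intra, inter


def density_generator(nodes, edges, rng):
    """Placeholder for a real SGG: keep each node pair with the observed edge density."""
    pairs = list(combinations(sorted(nodes), 2))  # quadratic, but only in community size
    p = len(edges) / len(pairs) if pairs else 0.0
    return [pair for pair in pairs if rng.random() < p]


def generate(edges, community_of, seed=0):
    """Toy pipeline: one generator per community, then link the communities."""
    rng = random.Random(seed)
    members = defaultdict(set)
    for node, c in community_of.items():
        members[c].add(node)
    intra, inter = split_by_community(edges, community_of)

    synthetic = []
    for c, nodes in members.items():
        synthetic.extend(density_generator(nodes, intra[c], rng))

    # Placeholder linking step: draw as many cross-community edges as the
    # original graph had, by rejection sampling (no full pair enumeration).
    target = len({tuple(sorted(e)) for e in inter})
    all_nodes = sorted(community_of)
    links = set()
    while len(links) < target:
        u, v = rng.sample(all_nodes, 2)
        if community_of[u] != community_of[v]:
            links.add((min(u, v), max(u, v)))
    synthetic.extend(sorted(links))
    return synthetic


# Two triangles joined by a single edge, one community per triangle.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
synthetic_edges = generate(edges, community_of)

# Pairs to consider: whole graph of 1000 nodes vs. 10 communities of 100 nodes.
print(candidate_pairs(1000), 10 * candidate_pairs(100))
```

The final line shows why the split helps: a generator that must consider every pair in a 1000-node graph faces 499,500 candidates, whereas ten 100-node communities total 49,500 intra-community candidates, with the remaining cross-community edges handled separately by the linking step.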
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Graph Neural Networks · Data Visualization and Analytics
