GRANDPA: GeneRAtive Network sampling using Degree and Property Augmentation applied to the analysis of partially confidential healthcare networks

Carly A. Bobak; Yifan Zhao; Joshua J. Levy; A. James O'Malley

arXiv:2211.15000 · stat.AP · November 29, 2022 · Appl. Netw. Sci.

Open Access

TL;DR

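GRANDPA is a graph simulation model that generates synthetic, privacy-preserving healthcare networks using degree and property augmentation. The generated graphs preserve vertex attribute relationships and approximately retain topological properties of the original graph, such as community structure.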

Contribution

Introduces a flexible R package for generating synthetic healthcare networks that preserve community structure and degree distributions, addressing privacy concerns.

Findings

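01 Community structure is preserved in the generated graphs for both case studies.

02 Normalized root mean square error between the cumulative degree distributions is low (0.0508 for Zachary's karate network and 0.0514 for the Medicare patient-sharing graph).

03 The method is demonstrated on real data: a patient-sharing graph generated from 2019 Medicare claims.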

Abstract

Protecting medical privacy can create obstacles in the analysis and distribution of healthcare graphs and the statistical inferences accompanying them. We propose a graph simulation model that generates networks using degree and property augmentation (GRANDPA) and provide a flexible R package that allows users to create graphs that preserve vertex attribute relationships and approximately retain topological properties observed in the original graph (e.g., community structure). We support our proposed algorithm with case studies based on Zachary's karate network and on a patient-sharing graph generated from Medicare claims data in 2019. In both cases, we find that community structure is preserved and that the normalized root mean square error between the cumulative distributions of the degrees is low (0.0508 and 0.0514, respectively).
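The degree-distribution comparison reported in the abstract can be illustrated with a minimal sketch. This is not the authors' R package or their exact metric. It assumes the error is the root mean square difference between the two empirical degree CDFs, evaluated at every integer degree up to the larger maximum, and normalized by the range of the original graph's CDF. The abstract does not state the normalization, so that choice is an assumption, and the function names `degree_sequence`, `degree_cdf`, and `cdf_nrmse` are hypothetical.

```python
from collections import Counter
import math


def degree_sequence(edges):
    """Return the degree of every vertex that appears in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return list(deg.values())


def degree_cdf(degrees, max_degree):
    """Empirical CDF of the degrees, evaluated at 0, 1, ..., max_degree."""
    counts = Counter(degrees)
    n = len(degrees)
    cdf, running = [], 0
    for k in range(max_degree + 1):
        running += counts.get(k, 0)
        cdf.append(running / n)
    return cdf


def cdf_nrmse(original_degrees, synthetic_degrees):
    """RMSE between the two degree CDFs, divided by the range of the original CDF.

    Assumption: range normalization. The paper may normalize differently.
    """
    max_degree = max(max(original_degrees), max(synthetic_degrees))
    f = degree_cdf(original_degrees, max_degree)
    g = degree_cdf(synthetic_degrees, max_degree)
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) / len(f))
    value_range = max(f) - min(f)
    return rmse / value_range if value_range > 0 else rmse


# Toy example (not data from the paper): a 4-vertex path versus a 4-vertex star.
path_edges = [(1, 2), (2, 3), (3, 4)]
star_edges = [(1, 2), (1, 3), (1, 4)]
toy_nrmse = cdf_nrmse(degree_sequence(path_edges), degree_sequence(star_edges))
print(round(toy_nrmse, 4))  # 0.1768
```

Under this definition, identical degree sequences give an error of 0, and values near 0.05, like those reported for the two case studies, would indicate that the two CDFs differ only slightly on average.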

Taxonomy

Topics: Data-Driven Disease Surveillance · Health disparities and outcomes · Data Analysis and Archiving