TL;DR
This paper introduces a Bayesian method that evaluates node metadata by how well it predicts network structure and missing nodes, making it possible to separate shortcomings of the annotations from shortcomings of community detection algorithms.
Contribution
It presents a joint generative model of network data and metadata, together with a nonparametric Bayesian framework for inferring its parameters from annotated datasets. Metadata is assessed by its capacity to predict edges rather than by its direct alignment with network communities.
Findings
Metadata often predicts the placement of edges well even when it aligns only imperfectly with the network communities.
Using metadata improves the prediction of missing nodes.
Imperfect correlation with community structure does not mean the metadata lacks meaningful structural information.
Abstract
The empirical validation of community detection methods is often based on available annotations on the nodes that serve as putative indicators of the large-scale network structure. Most often, the suitability of the annotations as topological descriptors is itself not assessed, and without this it is not possible to ultimately distinguish between actual shortcomings of the community detection algorithms on the one hand, and the incompleteness, inaccuracy or structured nature of the data annotations themselves on the other. In this work we present a principled method to assess both aspects simultaneously. We construct a joint generative model for the data and metadata, and a nonparametric Bayesian framework to infer its parameters from annotated datasets. We assess the quality of the metadata not according to its direct alignment with the network communities, but rather in its capacity to…
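The idea of judging metadata by how well it predicts edges, rather than by how well it matches detected communities, can be illustrated with a toy calculation. The sketch below is not the paper's joint generative model or its nonparametric Bayesian inference; it swaps in a plain maximum-likelihood Bernoulli block model whose groups are the metadata labels, and compares it with a single-group baseline. All function names (`bernoulli_ll`, `block_log_likelihood`, `metadata_gain`) and the example graph are hypothetical, and labels are assumed to be sortable values such as integers.

```python
import math
from collections import Counter
from itertools import combinations


def bernoulli_ll(e, n):
    """Maximized log-likelihood of e edges among n node pairs under one rate."""
    if n == 0 or e == 0 or e == n:
        return 0.0  # the fitted rate is 0 or 1, so the data are certain
    p = e / n
    return e * math.log(p) + (n - e) * math.log(1 - p)


def block_log_likelihood(n_nodes, edges, labels):
    """In-sample log-likelihood of a simple undirected graph under a
    Bernoulli block model whose groups are given by `labels`."""
    edge_set = {frozenset(e) for e in edges}
    pairs = Counter()  # node pairs per (label, label) block
    hits = Counter()   # observed edges per (label, label) block
    for u, v in combinations(range(n_nodes), 2):
        key = tuple(sorted((labels[u], labels[v])))
        pairs[key] += 1
        if frozenset((u, v)) in edge_set:
            hits[key] += 1
    return sum(bernoulli_ll(hits[k], n) for k, n in pairs.items())


def metadata_gain(n_nodes, edges, labels):
    """Log-likelihood gain of the label-based model over a one-group baseline."""
    baseline = block_log_likelihood(n_nodes, edges, [0] * n_nodes)
    return block_log_likelihood(n_nodes, edges, labels) - baseline


# Hypothetical example: two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
informative = [0, 0, 0, 1, 1, 1]    # labels follow the two triangles
uninformative = [0, 1, 0, 1, 0, 1]  # labels cut across the triangles

print(metadata_gain(6, edges, informative))
print(metadata_gain(6, edges, uninformative))
```

Because the one-group baseline is nested inside the label-based model, this in-sample gain can never be negative, so it rewards any labelling to some degree. Correcting for that kind of overfitting is one reason to prefer a nonparametric Bayesian treatment, as the abstract describes, or a held-out edge prediction test over this raw likelihood comparison.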
