A novel method for assessing and measuring homophily in networks through second-order statistics
Nicola Apollonio, Paolo Giulio Franciosa, Daniele Santoni

TL;DR
This paper introduces an efficient, null-model-based method for quantifying homophily in networks with categorical node attributes, demonstrated on biological and social networks, with explicit statistical significance measures.
Contribution
The novel approach prescribes an endogenous null model allowing exact z-score calculations for homophily, improving assessment accuracy and computational efficiency.
Findings
The networks studied are significantly homophilic with respect to node attributes
Method provides explicit z-scores for homophily significance
Computational complexity is linear in network size
Abstract
We present a new method for assessing and measuring homophily in networks whose nodes have categorical attributes, namely when the nodes of networks come partitioned into classes (colors). We probe this method in two different classes of networks: i) protein-protein interaction (PPI) networks, where nodes correspond to proteins, partitioned according to their functional role, and edges represent functional interactions between proteins; ii) the Pokec online social network, where nodes correspond to users, partitioned according to their age, and edges represent friendship between users. Similarly to other classical and well-consolidated approaches, our method compares the relative edge density of the subgraphs induced by each class with the corresponding expected relative edge density under a null model. The novelty of our approach consists in prescribing an endogenous null model, namely,…
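The comparison described in the abstract (observed edge density of a class-induced subgraph versus its expectation under a null model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract is truncated before it defines the authors' endogenous null model, so the sketch swaps in a simple stand-in where the class is replaced by a uniformly random node subset of the same size, and it estimates the z-score by Monte Carlo sampling rather than the exact calculation the paper describes. The function names `induced_edge_count` and `homophily_z_score` and the toy graph are hypothetical.

```python
import random
import statistics


def induced_edge_count(edges, nodes):
    """Count edges whose two endpoints both lie in `nodes`."""
    node_set = set(nodes)
    return sum(1 for u, v in edges if u in node_set and v in node_set)


def homophily_z_score(edges, all_nodes, class_nodes, samples=2000, seed=0):
    """Monte Carlo z-score of a class's relative edge density.

    ASSUMPTION: the null model draws a uniformly random node subset with
    the same size as the class. This is a stand-in, not necessarily the
    null model used in the paper.
    """
    rng = random.Random(seed)
    node_list = list(all_nodes)
    k = len(set(class_nodes))
    m = len(edges)

    # Relative edge density here means: share of all edges that fall
    # inside the class (an assumed normalization).
    observed = induced_edge_count(edges, class_nodes) / m

    null_values = [
        induced_edge_count(edges, rng.sample(node_list, k)) / m
        for _ in range(samples)
    ]
    mean = statistics.fmean(null_values)
    std = statistics.pstdev(null_values)
    return (observed - mean) / std if std > 0 else float("nan")


# Toy graph: two 4-node cliques joined by a single bridge edge (3, 4).
clique_a = [0, 1, 2, 3]
clique_b = [4, 5, 6, 7]
edges = [(u, v) for i, u in enumerate(clique_a) for v in clique_a[i + 1:]]
edges += [(u, v) for i, u in enumerate(clique_b) for v in clique_b[i + 1:]]
edges.append((3, 4))
nodes = clique_a + clique_b

# Class under test: the nodes of the first clique.
z = homophily_z_score(edges, nodes, clique_a)
print(f"z-score for class A: {z:.2f}")
```

A clearly positive z-score indicates that the class contains more internal edges than random subsets of the same size typically do, which is the sense of homophily used in this sketch. Sampling costs grow with the number of draws, so an exact closed-form z-score, as the paper reports, would avoid that overhead.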
Taxonomy
Topics: Complex Network Analysis Techniques · Bioinformatics and Genomic Networks · Topological and Geometric Data Analysis
