Information-Theoretic and Computational Limits of Correlation Detection under Graph Sampling
Dong Huang, Pengkun Yang

TL;DR
This paper studies the information-theoretic and computational limits of detecting correlation between two Erdős-Rényi graphs when only induced subgraphs are sampled. It proposes polynomial-time tests and presents evidence of a statistical-computational gap.
Contribution
It characterizes the sample complexity for correlation detection, proposes rate-optimal algorithms, and provides evidence of computational hardness, highlighting a statistical-computational gap.
Findings
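Sample complexity rates are optimal up to constant factors over most parameter regimes; the remaining gap is at most a subpolynomial factor.
Polynomial-time tests based on counting trees and bounded-degree motifs succeed in identified parameter regimes.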
Evidence suggests computational hardness matches achievable guarantees.
Abstract
Correlation analysis is a fundamental problem in statistics. In this paper, we consider the correlation detection problem between a pair of Erdős-Rényi graphs. Specifically, the problem is formulated as a hypothesis testing problem: under the null hypothesis, the two graphs are independent; under the alternative hypothesis, the two graphs are edge-correlated through a latent permutation. We focus on the scenario where only two induced subgraphs are sampled, and characterize the sample size threshold for detection. At the information-theoretic level, we establish the sample complexity rates that are optimal up to constant factors over most parameter regimes, and the remaining gap is bounded by a subpolynomial factor. On the algorithmic side, we propose polynomial-time tests based on counting trees and bounded-degree motifs, and identify the regimes where they succeed. Moreover,…
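The testing setup described in the abstract can be simulated with a short sketch. This is an illustration, not the authors' code: the abstract does not spell out how the edge correlation is constructed, so the sketch assumes the common parameterization in which each matched edge pair is a pair of Bernoulli(p) variables with correlation rho. The function names `sample_er_pair` and `sample_induced` and all parameter names are hypothetical.

```python
import numpy as np


def sample_er_pair(n, p, rho, correlated, rng):
    """Draw adjacency matrices (A, B) of two Erdos-Renyi G(n, p) graphs.

    correlated=False: A and B are independent (null hypothesis).
    correlated=True: matched edge indicators have correlation rho, and the
    vertices of B are then relabeled by a hidden uniform permutation
    (alternative hypothesis). The correlation model is an assumption.
    """
    iu = np.triu_indices(n, k=1)      # index pairs i < j
    m = len(iu[0])
    a = rng.random(m) < p             # edges of A
    if correlated:
        # Conditional law of b given a that keeps the marginal Bernoulli(p)
        # and gives Corr(a, b) = rho.
        prob_b = np.where(a, p + rho * (1 - p), p * (1 - rho))
        b = rng.random(m) < prob_b
    else:
        b = rng.random(m) < p

    A = np.zeros((n, n), dtype=np.int8)
    B = np.zeros((n, n), dtype=np.int8)
    A[iu] = a
    B[iu] = b
    A = A + A.T                       # symmetrize, diagonal stays zero
    B = B + B.T
    if correlated:
        pi = rng.permutation(n)       # latent vertex relabeling
        B = B[np.ix_(pi, pi)]
    return A, B


def sample_induced(A, s, rng):
    """Return the subgraph of A induced by s vertices chosen uniformly
    without replacement."""
    idx = rng.choice(A.shape[0], size=s, replace=False)
    return A[np.ix_(idx, idx)]


rng = np.random.default_rng(0)
A, B = sample_er_pair(n=200, p=0.1, rho=0.7, correlated=True, rng=rng)
A_sub = sample_induced(A, s=50, rng=rng)   # what the tester observes
B_sub = sample_induced(B, s=50, rng=rng)
```

A detection test would take only `A_sub` and `B_sub` as input and decide between the two hypotheses. The two vertex samples are drawn independently here, which is also an assumption of the sketch.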
Taxonomy
Topics: Complex Network Analysis Techniques · Statistical Methods and Inference · Bayesian Methods and Mixture Models
