Phase Transitions in the Detection of Correlated Databases
Dor Elimelech, Wasim Huleihel

Abstract
We study the problem of detecting the correlation between two Gaussian databases $X \in \mathbb{R}^{n \times d}$ and $Y \in \mathbb{R}^{n \times d}$, each composed of $n$ users with $d$ features. This problem is relevant in the analysis of social media, computational biology, etc. We formulate this as a hypothesis testing problem: under the null hypothesis, the two databases are statistically independent. Under the alternative, there exists an unknown permutation $\sigma$ over the set of $n$ users (that is, a row permutation) such that $X$ is $\rho$-correlated with $Y^{\sigma}$, a permuted version of $Y$. We determine sharp thresholds at which optimal testing exhibits a phase transition, depending on the asymptotic regime of $n$ and $d$. Specifically, we prove that if $\rho^2 d \to 0$ as $d \to \infty$, then weak detection (performing slightly better than random…
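The two hypotheses described in the abstract can be simulated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the listing: the function names `sample_databases` and `sum_statistic`, the use of standard Gaussian entries with entrywise correlation `rho` between matched rows, and the choice of a simple permutation-invariant statistic. It is not presented as the authors' test.

```python
import numpy as np


def sample_databases(n, d, rho, correlated, rng):
    """Draw two n x d databases with standard Gaussian entries.

    Assumed model (illustrative): under the alternative, row i of X is
    entrywise rho-correlated with row sigma[i] of Y for a hidden
    uniformly random permutation sigma. Under the null, X and Y are
    independent.
    """
    X = rng.standard_normal((n, d))
    if not correlated:
        return X, rng.standard_normal((n, d))
    noise = rng.standard_normal((n, d))
    # Each entry of Y_aligned has unit variance and correlation rho with X.
    Y_aligned = rho * X + np.sqrt(1.0 - rho**2) * noise
    sigma = rng.permutation(n)  # hidden row permutation
    Y = np.empty_like(Y_aligned)
    Y[sigma] = Y_aligned  # row i of X now matches row sigma[i] of Y
    return X, Y


def sum_statistic(X, Y):
    """Sum of all entries of X^T Y.

    This equals the inner product of the column sums of X and Y, so it
    does not depend on how the rows of Y are ordered.
    """
    return float(X.sum(axis=0) @ Y.sum(axis=0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for correlated in (False, True):
        X, Y = sample_databases(n=200, d=50, rho=0.9, correlated=correlated, rng=rng)
        print("correlated" if correlated else "independent", sum_statistic(X, Y))
```

Under this assumed model the statistic has mean zero under the null and mean `rho * n * d` under the alternative, which is why it needs no knowledge of the hidden permutation.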
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Statistical Methods and Inference · Data-Driven Disease Surveillance
Methods: Test
