Distance Correlation in Multiple Biased Sampling Models
Yuwei Ke, Hok Kan Ling, Yanglei Song

TL;DR
This paper investigates how to accurately estimate and test independence using distance correlation in data collected from multiple biased sources, addressing selection bias and improving statistical power.
Contribution
It introduces a framework for estimating distance covariance and distance correlation under multiple biased sampling models, establishes theoretical properties of the estimators, and proposes a weighted permutation test for independence.
Findings
Proves strong consistency of the distance covariance and correlation estimators and derives their asymptotic null distributions
Develops a weighted permutation procedure for determining critical values
Simulation results show improved estimation and testing power
Abstract
Testing the independence between random vectors is a fundamental problem in statistics. Distance correlation, a recently popular dependence measure, is universally consistent for testing independence against all distributions with finite moments. However, when data are subject to selection bias or collected from multiple sources or schemes, spurious dependence may arise. This creates a need for methods that can effectively utilize data from different sources and correct these biases. In this paper, we study the estimation of distance covariance and distance correlation under multiple biased sampling models, which provide a natural framework for addressing these issues. Theoretical properties, including the strong consistency and asymptotic null distributions of the distance covariance and correlation estimators, and the rate at which the test statistic diverges under sequences of…
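For orientation, the quantity being estimated can be illustrated with the classical, unweighted sample distance correlation, computed from double-centered pairwise distance matrices, together with an ordinary permutation p-value. This is only a minimal sketch for an unbiased i.i.d. sample: it is not the paper's bias-corrected estimator or its weighted permutation procedure, and the function names `distance_correlation` and `permutation_pvalue` are hypothetical, chosen here for illustration.

```python
import numpy as np


def _centered_distances(a):
    """Double-centered Euclidean distance matrix of a sample (rows = observations)."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat a 1-D input as n observations of a scalar
    d = np.linalg.norm(a[:, None, :] - a[None, :, :], axis=-1)
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Classical (unweighted) sample distance correlation, V-statistic form."""
    A = _centered_distances(x)
    B = _centered_distances(y)
    dcov2 = (A * B).mean()  # squared sample distance covariance
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0.0:
        return 0.0  # one sample is constant, so the correlation is defined as 0
    return float(np.sqrt(max(dcov2, 0.0) / denom))  # max() guards against rounding


def permutation_pvalue(x, y, n_perm=199, seed=0):
    """Ordinary (unweighted) permutation p-value based on squared distance covariance."""
    rng = np.random.default_rng(seed)
    A = _centered_distances(x)
    B = _centered_distances(y)
    observed = (A * B).mean()
    n = A.shape[0]
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permuting the observations of y permutes rows and columns of B together.
        if (A * B[np.ix_(p, p)]).mean() >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 60)
    y = x ** 2  # dependent on x, yet linearly uncorrelated with it
    print(distance_correlation(x, y), permutation_pvalue(x, y))
```

The uniform permutations used here assume exchangeability under the null, which is the assumption that selection bias can break; the abstract's point is that such bias can induce spurious dependence, which is why a correction is needed in the biased-sampling setting.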
Taxonomy
Topics: Survey Sampling and Estimation Techniques
