Distance Correlation in Multiple Biased Sampling Models

Yuwei Ke; Hok Kan Ling; Yanglei Song

arXiv:2408.11808·stat.ME·August 22, 2024

TL;DR

This paper studies how to estimate distance correlation and test independence when data are collected from multiple biased sources, correcting for selection bias and improving statistical power.

Contribution

It introduces a framework for estimating distance covariance and distance correlation under multiple biased sampling models, establishes the theoretical properties of the estimators, and proposes a weighted permutation procedure for determining the critical values of the independence test.

Findings

01. Proves strong consistency of the distance covariance and correlation estimators and derives their asymptotic null distributions

02. Develops a weighted permutation procedure for determining critical values

03. Simulation results show improved estimation accuracy and testing power

Abstract

Testing the independence between random vectors is a fundamental problem in statistics. Distance correlation, a recently popular dependence measure, is universally consistent for testing independence against all distributions with finite moments. However, when data are subject to selection bias or collected from multiple sources or schemes, spurious dependence may arise. This creates a need for methods that can effectively utilize data from different sources and correct these biases. In this paper, we study the estimation of distance covariance and distance correlation under multiple biased sampling models, which provide a natural framework for addressing these issues. Theoretical properties, including the strong consistency and asymptotic null distributions of the distance covariance and correlation estimators, and the rate at which the test statistic diverges under sequences of…
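For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of the classical (unweighted) sample distance correlation and an ordinary permutation test of independence, written with NumPy. It is not the paper's method: it assumes a single unbiased i.i.d. sample, applies no bias-correcting weights, and permutes observations uniformly rather than with the weighted procedure described above. The function names `centered_distances`, `distance_correlation`, and `permutation_pvalue` are hypothetical, chosen only for illustration.

```python
import numpy as np


def centered_distances(x):
    """Double-centered Euclidean distance matrix of the rows of x."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as n observations of a scalar
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def distance_correlation(x, y):
    """Classical V-statistic sample distance correlation (no bias weights)."""
    a, b = centered_distances(x), centered_distances(y)
    dcov2 = (a * b).mean()                          # squared distance covariance
    dvar2 = np.sqrt((a * a).mean() * (b * b).mean())  # product of distance variances
    if dvar2 == 0.0:
        return 0.0  # one sample is constant
    return float(np.sqrt(max(dcov2, 0.0) / dvar2))


def permutation_pvalue(x, y, n_perm=199, seed=0):
    """Plain (uniform) permutation p-value; NOT the paper's weighted version."""
    rng = np.random.default_rng(seed)
    a, b = centered_distances(x), centered_distances(y)
    observed = (a * b).mean()
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(b.shape[0])
        # Permuting rows and columns of b together equals relabelling y.
        if (a * b[np.ix_(p, p)]).mean() >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


# Usage: y depends on x nonlinearly, so Pearson correlation would be near
# zero while distance correlation is not.
rng = np.random.default_rng(42)
x = rng.normal(size=100)
y = x ** 2 + 0.1 * rng.normal(size=100)
print(distance_correlation(x, y), permutation_pvalue(x, y))
```

Under biased sampling, a uniform permutation scheme like this one can report spurious dependence, which is the problem the paper's weighted estimators and weighted permutation procedure are designed to address.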

Taxonomy

Topics: Survey Sampling and Estimation Techniques