A framework for paired-sample hypothesis testing for high-dimensional data
Ioannis Bargiotas, Argyris Kalogeratos, Nicolas Vayatis

TL;DR
This paper introduces a paired-sample testing framework for high-dimensional data that builds scoring functions from hyperplanes and combines them with the Hodges-Lehmann estimator, improving testing accuracy over traditional multivariate and multiple testing methods.
Contribution
The paper proposes a two-step testing procedure that uses hyperplanes and the Hodges-Lehmann estimator for high-dimensional paired-sample hypothesis testing.
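For readers unfamiliar with the Hodges-Lehmann estimator, the following is a minimal sketch of its classical one-sample, univariate form: the median of all Walsh averages, which estimates the pseudomedian. It is not the paper's multidimensional extension; the function name `hodges_lehmann` and the sample values are hypothetical and chosen only for illustration.

```python
from itertools import combinations_with_replacement
from statistics import median


def hodges_lehmann(values):
    """Classical one-sample Hodges-Lehmann estimate.

    Returns the median of all Walsh averages (x_i + x_j) / 2 taken over
    index pairs with i <= j. This is the univariate textbook form only.
    """
    if not values:
        raise ValueError("need at least one observation")
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(values, 2)]
    return median(walsh)


# Hypothetical paired differences (after minus before) for one feature.
diffs = [1.0, 2.0, 3.0, 10.0]
estimate = hodges_lehmann(diffs)
print(estimate)  # 2.75
```

For these values the estimate (2.75) sits between the sample median (2.5) and the mean (4.0), which illustrates why the pseudomedian is often used as a robust location summary for paired differences.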
Findings
Substantial performance gains in testing accuracy.
Effective estimation of feature contributions.
Outperforms traditional multivariate and multiple testing methods.
Abstract
The standard paired-sample testing approach in the multidimensional setting applies multiple univariate tests on the individual features, followed by p-value adjustments. Such an approach suffers when the data carry numerous features. A number of studies have shown that classification accuracy can be seen as a proxy for two-sample testing. However, neither theoretical foundations nor practical recipes have been proposed so far on how this strategy could be extended to multidimensional paired-sample testing. In this work, we put forward the idea that scoring functions can be produced by the decision rules defined by the perpendicular bisecting hyperplanes of the line segments connecting each pair of instances. Then, the optimal scoring function can be obtained by the pseudomedian of those rules, which we estimate by extending naturally the Hodges-Lehmann estimator. We accordingly propose…
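The geometric ingredient named in the abstract, the perpendicular bisecting hyperplane of the segment joining a pair of instances, can be sketched as follows. This illustrates only the decision rule induced by a single pair; how the paper combines the rules from all pairs is not shown in the truncated abstract and is not attempted here. The helper name `bisector_score` and the example points are assumptions made for illustration.

```python
def bisector_score(x, y):
    """Scoring function of the perpendicular bisecting hyperplane of segment x-y.

    The hyperplane passes through the midpoint of x and y and has normal
    vector y - x. The returned function is positive on y's side, negative
    on x's side, and zero for points equidistant from x and y.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    normal = [b - a for a, b in zip(x, y)]
    midpoint = [(a + b) / 2 for a, b in zip(x, y)]

    def score(z):
        return sum(w * (zi - mi) for w, zi, mi in zip(normal, z, midpoint))

    return score


# Hypothetical pair: one subject measured before and after, two features.
before = [0.0, 0.0]
after = [2.0, 2.0]
score = bisector_score(before, after)

print(score(before))      # -4.0, the "before" side
print(score(after))       # 4.0, the "after" side
print(score([2.0, 0.0]))  # 0.0, equidistant from both points
```

The sign of the score acts as a decision rule that assigns a point to the "before" or "after" side of that pair.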
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Methods and Inference · Advanced Statistical Process Monitoring
