Extending Kernel Testing To General Designs
Anthony Ozier-Lafontaine, Polina Arsenteva, Franck Picard, Bertrand Michel

TL;DR
This paper extends kernel-based non-parametric testing methods from simple two-sample scenarios to complex experimental designs by introducing a linear model in RKHS and a new test statistic, broadening applicability.
Contribution
It proposes a linear model in RKHS for general designs and introduces a truncated kernel Hotelling-Lawley statistic with proven asymptotic properties, enabling kernel testing beyond two-sample cases.
Findings
The new test statistic follows a chi-square distribution asymptotically.
The framework is demonstrated on single-cell RNA sequencing data.
Kernel-based diagnostic tools are generalized for complex designs.
Abstract
Kernel-based testing has revolutionized the field of non-parametric tests through the embedding of distributions in an RKHS. This strategy has proven to be powerful and flexible, yet its applicability has been limited to the standard two-sample case, while practical situations often involve more complex experimental designs. To extend kernel testing to any design, we propose a linear model in the RKHS that allows for the decomposition of mean embeddings into additive functional effects. We then introduce a truncated kernel Hotelling-Lawley statistic to test the effects of the model, demonstrating that its asymptotic distribution is chi-square, which remains valid with its Nyström approximation. We discuss a homoscedasticity assumption that, although absent in the standard two-sample case, is necessary for general designs. Finally, we illustrate our framework using a single-cell RNA…
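To make the idea concrete, the following is a minimal sketch of how a truncated Hotelling-Lawley-type statistic could be computed from a Gram matrix alone. It is not the authors' implementation. The function names (`gaussian_gram`, `truncated_khl`), the Gaussian kernel, the 1/n normalization of the residual covariance, and the degrees of freedom rank(L) × T are all assumptions made for illustration, chosen so that a linear kernel with full truncation recovers the classical multivariate Hotelling-Lawley trace. Here `X` is a design matrix, `L` a contrast matrix encoding the tested hypothesis, and `T` the truncation level.

```python
import numpy as np


def gaussian_gram(Y, bandwidth=1.0):
    """Gram matrix of a Gaussian kernel (an illustrative kernel choice)."""
    sq = np.sum(Y ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Y @ Y.T, 0.0)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def truncated_khl(K, X, L, T):
    """Hypothetical truncated kernel Hotelling-Lawley-type statistic.

    K: (n, n) Gram matrix, X: (n, p) design, L: (d, p) contrast, T: truncation.
    Returns the statistic and the assumed chi-square degrees of freedom.
    """
    n = K.shape[0]
    XtX_inv = np.linalg.pinv(X.T @ X)
    # Residual projector of the linear model (removes the fitted effects).
    R = np.eye(n) - X @ XtX_inv @ X.T
    # Projector onto the part of the design tested by the contrast L.
    C = X @ XtX_inv @ L.T
    H = C @ np.linalg.pinv(L @ XtX_inv @ L.T) @ C.T
    # Spectrum of the residual covariance operator via its n x n dual form.
    evals, evecs = np.linalg.eigh(R @ K @ R / n)
    top = np.argsort(evals)[::-1][:T]
    lam, U = evals[top], evecs[:, top]
    # Quadratic form of the hypothesis operator along each kept direction.
    B = K @ R @ U
    quad = np.einsum("it,ij,jt->t", B, H, B)
    stat = float(np.sum(quad / (n * lam ** 2)))
    dof = int(np.linalg.matrix_rank(L)) * T  # assumed degrees of freedom
    return stat, dof


# Usage on synthetic two-group data (the two-sample case as a special design).
rng = np.random.default_rng(0)
Y = np.vstack([rng.normal(0.0, 1.0, (30, 5)), rng.normal(0.5, 1.0, (30, 5))])
X = np.zeros((60, 2))
X[:30, 0] = 1.0
X[30:, 1] = 1.0
L = np.array([[1.0, -1.0]])
stat, dof = truncated_khl(gaussian_gram(Y, bandwidth=2.0), X, L, T=4)
```

Under the assumptions above, an approximate p-value could be read from a chi-square distribution with `dof` degrees of freedom, for example with `scipy.stats.chi2.sf(stat, dof)`. More complex designs only change `X` and `L`; the Gram-matrix computation stays the same.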
Taxonomy
Topics: Single-cell and spatial transcriptomics · Statistical Methods and Inference · Gene expression and cancer classification
