A Statistical Test for Joint Distributions Equivalence
Francesco Solera, Andrea Palazzi

TL;DR
This paper introduces a distribution-free statistical test based on joint kernel distribution embedding to determine if two joint distributions differ, applicable to dataset-shift detection in machine learning without restrictive assumptions.
Contribution
It extends the kernel two-sample test to joint distributions, enabling distribution comparison without assumptions on the nature of the shift.
Findings
Effective in detecting dataset-shift in various scenarios
Applicable without assumptions on the type of distribution change
Provides a practical tool for model validation
Abstract
We provide a distribution-free test that can be used to determine whether any two joint distributions p and q are statistically different by inspection of a large enough set of samples. Following recent efforts from Long et al. [1], we rely on joint kernel distribution embedding to extend the kernel two-sample test of Gretton et al. [2] to the case of joint probability distributions. Our main result can be directly applied to verify whether a dataset-shift has occurred between training and test distributions in a learning framework, without further assuming the shift has occurred only in the input, in the target or in the conditional distribution.
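The idea in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes that the joint embedding is realised by a product of Gaussian kernels on inputs and targets, uses the biased MMD² estimate from the kernel two-sample test, and swaps the paper's distribution-free acceptance threshold for a plain permutation test. The function names, bandwidths and permutation count are all hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth ** 2))


def joint_shift_test(x1, y1, x2, y2, n_perm=200, seed=0, bw_x=1.0, bw_y=1.0):
    """Permutation test on the biased MMD^2 between two joint samples.

    Assumption: the joint kernel is the product k_x(x, x') * k_y(y, y'),
    which corresponds to a tensor-product embedding of the pair (x, y).
    All inputs are 2-D arrays of shape (n_samples, n_features).
    """
    rng = np.random.default_rng(seed)
    x = np.vstack([x1, x2])
    y = np.vstack([y1, y2])
    # Joint kernel matrix over the pooled sample, computed once.
    k = rbf_kernel(x, x, bw_x) * rbf_kernel(y, y, bw_y)
    n = len(x1)
    total = len(x)

    def mmd2(idx):
        a, b = idx[:n], idx[n:]
        return (
            k[np.ix_(a, a)].mean()
            + k[np.ix_(b, b)].mean()
            - 2.0 * k[np.ix_(a, b)].mean()
        )

    observed = mmd2(np.arange(total))
    # Null distribution: reassign pooled (x, y) pairs to the two groups at random.
    null = np.array([mmd2(rng.permutation(total)) for _ in range(n_perm)])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x_train = rng.normal(size=(200, 1))
    x_test = rng.normal(size=(200, 1))
    # Same input marginal, but the conditional of y given x flips sign.
    y_train = x_train + 0.1 * rng.normal(size=(200, 1))
    y_test = -x_test + 0.1 * rng.normal(size=(200, 1))
    stat, p = joint_shift_test(x_train, y_train, x_test, y_test)
    print(f"MMD^2 = {stat:.4f}, permutation p-value = {p:.4f}")
```

In the demo the marginals of x and of y are the same in both samples, so a test on inputs alone or targets alone would have little to detect; only the joint comparison sees the change. A small p-value suggests that the two joint samples come from different distributions.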
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Anomaly Detection Techniques and Applications · Machine Learning and Algorithms
