A Statistical Test for Joint Distributions Equivalence

Francesco Solera; Andrea Palazzi

arXiv:1607.07270 · cs.LG · July 26, 2016 · 1 citation

TL;DR

This paper introduces a distribution-free statistical test, based on joint kernel distribution embedding, for deciding whether two joint distributions differ. It applies to dataset-shift detection in machine learning without assuming whether the shift lies in the input, the target, or the conditional distribution.

Contribution

The paper extends the kernel two-sample test to joint probability distributions, so two distributions can be compared without assumptions on the nature of the shift.

Findings

01 Effective in detecting dataset-shift in various scenarios

02 Applicable without assumptions on the type of distribution change

03 Provides a practical tool for model validation

Abstract

We provide a distribution-free test that can be used to determine whether any two joint distributions $p$ and $q$ are statistically different by inspection of a large enough set of samples. Following recent efforts from Long et al. [1], we rely on joint kernel distribution embedding to extend the kernel two-sample test of Gretton et al. [2] to the case of joint probability distributions. Our main result can be directly applied to verify if a dataset-shift has occurred between training and test distributions in a learning framework, without further assuming the shift has occurred only in the input, in the target or in the conditional distribution.
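To make the idea in the abstract concrete, here is a minimal sketch of a kernel two-sample test applied to joint samples $(x, y)$. It is not the authors' implementation. It assumes a joint kernel formed as the product of two Gaussian kernels (one on inputs, one on targets), uses the biased squared-MMD estimator, and calibrates the test with a permutation procedure, since the abstract does not state how the rejection threshold is set. The names `rbf_gram`, `joint_mmd2`, `permutation_test`, and the bandwidth parameters are hypothetical.

```python
import numpy as np


def rbf_gram(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def joint_mmd2(xs, ys, xt, yt, bw_x=1.0, bw_y=1.0):
    """Biased squared-MMD estimate between joint samples (xs, ys) and (xt, yt).

    Assumption: the joint kernel is the product k_x(x, x') * k_y(y, y').
    """
    def joint_gram(xa, ya, xb, yb):
        return rbf_gram(xa, xb, bw_x) * rbf_gram(ya, yb, bw_y)

    k_ss = joint_gram(xs, ys, xs, ys)
    k_tt = joint_gram(xt, yt, xt, yt)
    k_st = joint_gram(xs, ys, xt, yt)
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()


def permutation_test(xs, ys, xt, yt, n_perm=200, seed=0, **kernel_args):
    """Permutation p-value for H0: both samples come from the same joint law.

    Assumption: a permutation null stands in for whatever threshold the
    paper derives; (x, y) pairs are shuffled together, never separately.
    """
    rng = np.random.default_rng(seed)
    observed = joint_mmd2(xs, ys, xt, yt, **kernel_args)
    x_all = np.vstack([xs, xt])
    y_all = np.vstack([ys, yt])
    n_first = len(xs)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(x_all))
        a, b = perm[:n_first], perm[n_first:]
        stat = joint_mmd2(x_all[a], y_all[a], x_all[b], y_all[b], **kernel_args)
        if stat >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value


# Synthetic example: the marginals of x and y match across the two samples,
# but the conditional relationship flips sign (y = x versus y = -x).
rng = np.random.default_rng(0)
n = 100
x_train = rng.normal(size=(n, 1))
y_train = x_train + 0.1 * rng.normal(size=(n, 1))
x_test = rng.normal(size=(n, 1))
y_test = -x_test + 0.1 * rng.normal(size=(n, 1))

mmd_shift, p_shift = permutation_test(x_train, y_train, x_test, y_test)
print(f"MMD^2 = {mmd_shift:.4f}, p-value = {p_shift:.4f}")
```

The synthetic data is chosen so that a test on the inputs alone, or on the targets alone, would see nearly identical marginals; only a comparison of the joint samples exposes the change. Fixed unit bandwidths are used for brevity; in practice they would need to be chosen for the data at hand.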

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Anomaly Detection Techniques and Applications · Machine Learning and Algorithms