Post Selection Inference with Incomplete Maximum Mean Discrepancy Estimator
Makoto Yamada, Denny Wu, Yao-Hung Hubert Tsai, Ichiro Takeuchi, Ruslan Salakhutdinov, Kenji Fukumizu

TL;DR
This paper introduces a post selection inference framework built on an MMD estimator based on incomplete U-statistics. The framework identifies statistically significant features that discriminate two distributions, with applications to high-dimensional data and GAN analysis.
Contribution
It proposes a novel MMD estimator that is asymptotically Normal, together with a general hypothesis test for post selection inference on the selected features.
Findings
Successfully detects significant features in synthetic and real data
Provides high detection power with the incomplete U-statistics MMD estimator
Enables analysis of members of the GAN family through sample selection
Abstract
Measuring divergence between two distributions is essential in machine learning and statistics and has various applications, including binary classification, change point detection, and two-sample testing. Furthermore, in the era of big data, designing a divergence measure that is interpretable and can handle high-dimensional and complex data becomes extremely important. In this paper, we propose a post selection inference (PSI) framework for divergence measures, which can select a set of statistically significant features that discriminate two distributions. Specifically, we employ an additive variant of maximum mean discrepancy (MMD) for features and introduce a general hypothesis test for PSI. A novel MMD estimator using incomplete U-statistics, which has an asymptotically Normal distribution (under mild assumptions) and gives high detection power in PSI, is also proposed and analyzed…
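To make the idea of an incomplete U-statistics MMD estimator concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes equal sample sizes, a Gaussian kernel with a fixed bandwidth, and index pairs drawn uniformly at random with replacement; the paper's own pair-selection design may differ. The names gaussian_kernel, incomplete_mmd2, num_pairs, and bandwidth are hypothetical. The sketch averages the standard MMD U-statistic kernel h((x_i, y_i), (x_j, y_j)) = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i) over a random subset of pairs with i != j, instead of over all n(n-1) pairs.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel evaluated row by row on paired arrays a and b."""
    sq_dist = np.sum((a - b) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def incomplete_mmd2(x, y, num_pairs, bandwidth=1.0, seed=None):
    """Estimate squared MMD from a random subset of index pairs.

    Assumptions (not taken from the paper): x and y have the same
    shape (n, d), and pairs (i, j) with i != j are sampled uniformly
    with replacement.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 2:
        raise ValueError("x and y must both have shape (n, d)")
    n = x.shape[0]
    if n < 2:
        raise ValueError("need at least two samples per distribution")

    rng = np.random.default_rng(seed)
    i = rng.integers(0, n, size=num_pairs)
    # Adding an offset in [1, n-1] modulo n guarantees j != i.
    j = (i + rng.integers(1, n, size=num_pairs)) % n

    # U-statistic kernel h evaluated on each sampled pair.
    h = (
        gaussian_kernel(x[i], x[j], bandwidth)
        + gaussian_kernel(y[i], y[j], bandwidth)
        - gaussian_kernel(x[i], y[j], bandwidth)
        - gaussian_kernel(x[j], y[i], bandwidth)
    )
    return float(h.mean())
```

For example, incomplete_mmd2(x, y, num_pairs=4 * len(x)) evaluates the kernel on a number of pairs that grows linearly in n rather than quadratically. Applying the function to a single column, as in incomplete_mmd2(x[:, [f]], y[:, [f]], num_pairs), gives a per-feature score of the kind an additive, feature-wise MMD would sum over.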
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Statistical Methods and Inference · Adversarial Robustness in Machine Learning
