A Unified Data Representation Learning for Non-parametric Two-sample Testing
Xunye Tian, Liuhua Peng, Zhijian Zhou, Mingming Gong, Arthur Gretton, Feng Liu

TL;DR
This paper introduces RL-TST, a novel framework for non-parametric two-sample testing that leverages entire datasets for representation learning without compromising error control, improving test power.
Contribution
It proposes a unified self-supervised and discriminative representation learning framework that utilizes the full dataset for more effective two-sample testing.
Findings
RL-TST outperforms existing methods in experiments.
Uses both the structure of the data manifold and discriminative features.
Improves test power while keeping the Type-I error under control.
Abstract
Learning effective data representations has been crucial in non-parametric two-sample testing. Common approaches first split the data into training and test sets and then learn representations purely on the training set. However, recent theoretical studies have shown that, as long as the sample indexes are not used during learning, the whole dataset can be used to learn representations while still controlling the Type-I error. This fact motivates us to use the test set (without its sample indexes) to facilitate representation learning for testing. To this end, we propose a representation-learning two-sample testing (RL-TST) framework. RL-TST first performs purely self-supervised representation learning on the entire dataset to capture inherent representations (IRs) that reflect the underlying data manifold. A discriminative model is then trained…
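The claim that the whole dataset can be used, provided the sample indexes are not, can be illustrated with a small sketch. This is not the authors' RL-TST: PCA on the pooled data stands in for the self-supervised stage, a Gaussian-kernel MMD statistic with a permutation test stands in for the testing stage, and the discriminative stage is left out because the abstract is truncated before describing it. All function names and parameters here are hypothetical.

```python
import numpy as np

def pooled_pca(Z, k):
    """Label-free representation: project pooled data onto its top-k principal directions.
    Assumption: a simple stand-in for self-supervised representation learning."""
    Zc = Z - Z.mean(axis=0)
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Zc @ Vt[:k].T

def mmd2_from_kernel(K, idx_x, idx_y):
    """Biased squared-MMD estimate read off a precomputed kernel matrix."""
    Kxx = K[np.ix_(idx_x, idx_x)].mean()
    Kyy = K[np.ix_(idx_y, idx_y)].mean()
    Kxy = K[np.ix_(idx_x, idx_y)].mean()
    return Kxx + Kyy - 2.0 * Kxy

def permutation_two_sample_test(X, Y, k=2, n_perm=200, seed=0):
    """Return a permutation p-value for H0: X and Y come from the same distribution."""
    rng = np.random.default_rng(seed)
    n = len(X)
    Z = np.vstack([X, Y])

    # The representation sees every row of Z but never which sample a row
    # belongs to, so relabelling the rows cannot change it.
    R = pooled_pca(Z, k)

    # Gaussian kernel on the representations; the bandwidth (median heuristic)
    # is also computed without sample indexes.
    sq = np.sum((R[:, None, :] - R[None, :, :]) ** 2, axis=-1)
    bandwidth = np.median(sq[sq > 0])
    K = np.exp(-sq / bandwidth)

    idx = np.arange(len(Z))
    observed = mmd2_from_kernel(K, idx[:n], idx[n:])

    # Recompute the statistic under random relabellings of the pooled rows.
    count = 0
    for _ in range(n_perm):
        p = rng.permutation(len(Z))
        if mmd2_from_kernel(K, p[:n], p[n:]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

Because the representation and the kernel are fixed before any relabelling, each permutation only reshuffles which rows count as the first sample, which is what keeps the permutation p-value valid in this sketch. A learned representation that had seen the sample indexes would not have this property.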
Taxonomy
Topics: Face and Expression Recognition · Advanced Statistical Methods and Models · Fault Detection and Control Systems
Methods: Sparse Evolutionary Training
