Kernel Mean Estimation by Marginalized Corrupted Distributions

Xiaobo Xia; Shuo Shan; Mingming Gong; Nannan Wang; Fei Gao; Haikun Wei; Tongliang Liu

arXiv:2107.04855·cs.LG·July 13, 2021

Open Access

TL;DR

This paper introduces a novel kernel mean estimator that uses data corruption with known noise distributions to improve estimation accuracy in kernel learning, providing theoretical regularization insights and empirical performance gains.

Contribution

It proposes the marginalized kernel mean estimator, a new approach that leverages data corruption to enhance kernel mean estimation with theoretical and empirical validation.

Findings

01 Lower estimation error compared to existing methods

02 Implicit regularization effect demonstrated theoretically

03 Effective across various datasets

Abstract

Estimating the kernel mean in a reproducing kernel Hilbert space is a critical component in many kernel learning algorithms. Given a finite sample, the standard estimate of the target kernel mean is the empirical average. Previous works have shown that better estimators can be constructed by shrinkage methods. In this work, we propose to corrupt data examples with noise from known distributions and present a new kernel mean estimator, called the marginalized kernel mean estimator, which estimates kernel mean under the corrupted distribution. Theoretically, we show that the marginalized kernel mean estimator introduces implicit regularization in kernel mean estimation. Empirically, we show on a variety of datasets that the marginalized kernel mean estimator obtains much lower estimation error than the existing estimators.
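To make the two estimators named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian RBF kernel with bandwidth `sigma` and isotropic Gaussian corruption noise with standard deviation `noise_std`; both choices, and all function names, are illustrative assumptions rather than details taken from the paper. Under these assumptions the expectation of the kernel over the noise has a closed form (a Gaussian convolved with a Gaussian), so no sampling is needed.

```python
import numpy as np

def rbf_kernel(X, Y, sigma):
    """Gaussian RBF kernel matrix: k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

def empirical_kernel_mean(X, Y, sigma):
    """Standard estimate: (1/n) sum_i k(x_i, y), evaluated at each row y of Y."""
    return rbf_kernel(X, Y, sigma).mean(axis=0)

def marginalized_kernel_mean(X, Y, sigma, noise_std):
    """Kernel mean under the corrupted distribution, evaluated at each row of Y.

    Assumes each x_i is corrupted as x_i + e with e ~ N(0, noise_std^2 I).
    For the RBF kernel the expectation over e is available in closed form:
    a kernel with widened bandwidth, multiplied by a factor that is at most 1.
    """
    d = X.shape[1]
    widened_sigma = np.sqrt(sigma**2 + noise_std**2)
    shrink = (sigma**2 / (sigma**2 + noise_std**2)) ** (d / 2.0)
    return shrink * rbf_kernel(X, Y, widened_sigma).mean(axis=0)
```

In this special case the corrupted estimate is the empirical estimate with a wider bandwidth and a multiplicative factor below one, which may give some intuition for the implicit regularization the abstract describes. With `noise_std = 0` the two functions agree.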

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM · Gaussian Processes and Bayesian Inference