High-Dimensional Multi-Task Averaging and Application to Kernel Mean Embedding
Hannah Marienwald (TUB), Jean-Baptiste Fermanian (ENS Rennes), Gilles Blanchard (DATASHAPE, LMO, CNRS)

TL;DR
This paper introduces a multi-task averaging estimator that leverages task similarities to improve mean estimation accuracy, especially in high-dimensional settings, with applications to kernel mean embeddings.
Contribution
It proposes a novel shrinkage-based estimator that exploits unknown task similarities, reducing mean squared error in high-dimensional multi-task problems.
Findings
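The estimator provably reduces mean squared error compared to the naive approach of taking the empirical mean of each data set individually.
The improvement can be significant when the dimension of the input space is large, a 'blessing of dimensionality' phenomenon.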
The method improves kernel mean embedding estimation in practical applications.
Abstract
We propose an improved estimator for the multi-task averaging problem, whose goal is the joint estimation of the means of multiple distributions using separate, independent data sets. The naive approach is to take the empirical mean of each data set individually, whereas the proposed method exploits similarities between tasks, without any related information being known in advance. First, for each data set, similar or neighboring means are determined from the data by multiple testing. Then each naive estimator is shrunk towards the local average of its neighbors. We prove theoretically that this approach provides a reduction in mean squared error. This improvement can be significant when the dimension of the input space is large, demonstrating a "blessing of dimensionality" phenomenon. An application of this approach is the estimation of multiple kernel mean embeddings, which plays an…
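The two steps described in the abstract (find neighboring means by testing, then shrink each naive estimator towards the local average of its neighbors) can be illustrated with a minimal NumPy sketch. This is not the authors' procedure: the function name `multitask_shrinkage_means`, the plug-in distance threshold controlled by `tau`, and the fixed shrinkage weight `gamma` are all assumptions made for illustration, since the abstract does not specify the test or the weights.

```python
import numpy as np

def multitask_shrinkage_means(datasets, tau=2.0, gamma=0.5):
    """Illustrative sketch only; `tau`, `gamma`, and the test rule are assumptions.

    datasets: list of arrays, each of shape (n_i, d) with n_i >= 2.
    Returns (naive, shrunk), both of shape (B, d).
    """
    # Naive estimator: the empirical mean of each data set on its own.
    naive = np.stack([x.mean(axis=0) for x in datasets])

    # Rough per-task noise level: estimate of E||naive_i - mu_i||^2,
    # i.e. the trace of the sample covariance divided by n_i.
    noise = np.array([x.var(axis=0, ddof=1).sum() / len(x) for x in datasets])

    shrunk = np.empty_like(naive)
    for i in range(len(datasets)):
        # Squared distance from task i's naive mean to every naive mean.
        sq_dist = ((naive - naive[i]) ** 2).sum(axis=1)

        # Assumed plug-in test: task j counts as a neighbor of task i when the
        # distance is within tau times the combined noise level of both tasks.
        # Task i always passes (distance zero), so the set is never empty.
        neighbors = sq_dist <= tau * (noise + noise[i])

        # Shrink the naive mean towards the local average of its neighbors.
        local_avg = naive[neighbors].mean(axis=0)
        shrunk[i] = gamma * naive[i] + (1.0 - gamma) * local_avg
    return naive, shrunk
```

Under these assumptions, tasks whose means coincide end up pooled, which lowers the variance of each estimate, while a task that is far from all others keeps its naive estimate unchanged.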
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Statistical Methods and Inference · Sparse and Compressive Sensing Techniques
