Multi-Task Learning with Summary Statistics

Parker Knight; Rui Duan

arXiv:2307.02388 · stat.ME · February 9, 2024 · NeurIPS · 2 cites

TL;DR

This paper introduces a flexible multi-task learning framework that uses summary statistics to overcome data-sharing constraints, with an adaptive parameter selection method and theoretical analysis applicable to real-world scenarios like healthcare.

Contribution

It proposes a novel multi-task learning approach utilizing summary statistics and an adaptive tuning method, supported by non-asymptotic theoretical analysis.

Findings

01 Effective performance demonstrated through extensive simulations.

02 Theoretical guarantees under various sample regimes.

03 Applicable to fields like genetic risk prediction.

Abstract

Multi-task learning has emerged as a powerful machine learning paradigm for integrating data from multiple sources, leveraging similarities between tasks to improve overall model performance. However, the application of multi-task learning to real-world settings is hindered by data-sharing constraints, especially in healthcare settings. To address this challenge, we propose a flexible multi-task learning framework utilizing summary statistics from various sources. Additionally, we present an adaptive parameter selection approach based on a variant of Lepski's method, allowing for data-driven tuning parameter selection when only summary statistics are available. Our systematic non-asymptotic analysis characterizes the performance of the proposed methods under various regimes of the sample complexity and overlap. We demonstrate our theoretical findings and the performance of the method…
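As a rough illustration of why summary statistics can stand in for individual-level data, the sketch below fits a ridge regression using only the Gram matrix X'X/n and the cross-moment X'y/n. This is not the estimator proposed in the paper, whose details are not given on this page; ridge regression is swapped in as a simple stand-in, and the function names, the simulated data, and the penalty value `lam` are all assumptions made for the example.

```python
import numpy as np

def summary_stats(X, y):
    """Hypothetical helper: per-site summaries X'X/n and X'y/n."""
    n = X.shape[0]
    return X.T @ X / n, X.T @ y / n

def ridge_from_summaries(gram, xty, lam):
    """Solve (gram + lam * I) beta = xty using summaries only."""
    p = gram.shape[0]
    return np.linalg.solve(gram + lam * np.eye(p), xty)

# Simulated data standing in for one site's private records (assumption).
rng = np.random.default_rng(0)
n, p, lam = 200, 5, 0.1
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.0, 0.5, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# The site shares only these two objects, never the rows of X or y.
gram, xty = summary_stats(X, y)
beta_summary = ridge_from_summaries(gram, xty, lam)

# Reference fit from the raw data: ridge written as an augmented
# least-squares problem, minimizing ||y - Xb||^2 + n * lam * ||b||^2.
X_aug = np.vstack([X, np.sqrt(n * lam) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])
beta_raw, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

print(np.allclose(beta_summary, beta_raw))
```

The two fits agree because the ridge objective depends on the data only through X'X and X'y, so a site can contribute to a joint analysis without releasing individual records.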

Videos

Multi-task learning with summary statistics · slideslive

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning in Healthcare · Gene expression and cancer classification