Multi-Task Learning with Summary Statistics
Parker Knight, Rui Duan

TL;DR
This paper introduces a flexible multi-task learning framework that uses summary statistics to work around data-sharing constraints, together with an adaptive tuning parameter selection method and a non-asymptotic theoretical analysis. It is motivated by real-world settings such as healthcare.
Contribution
The paper proposes a multi-task learning approach that relies only on summary statistics, paired with an adaptive tuning method based on a variant of Lepski's method and supported by non-asymptotic theoretical analysis.
Findings
The method performs well in extensive simulations.
Theoretical guarantees hold under various regimes of sample complexity and overlap.
The approach applies to fields such as genetic risk prediction.
Abstract
Multi-task learning has emerged as a powerful machine learning paradigm for integrating data from multiple sources, leveraging similarities between tasks to improve overall model performance. However, the application of multi-task learning to real-world settings is hindered by data-sharing constraints, especially in healthcare settings. To address this challenge, we propose a flexible multi-task learning framework utilizing summary statistics from various sources. Additionally, we present an adaptive parameter selection approach based on a variant of Lepski's method, allowing for data-driven tuning parameter selection when only summary statistics are available. Our systematic non-asymptotic analysis characterizes the performance of the proposed methods under various regimes of the sample complexity and overlap. We demonstrate our theoretical findings and the performance of the method…
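To make the idea of learning from summary statistics concrete, the sketch below fits a pooled ridge regression using only per-site Gram matrices and cross-products, never the raw rows. This is a much simpler stand-in and not the estimator proposed in the paper: the function names (`summarize`, `solve`, `ridge_from_summaries`), the two toy sites, and the assumption that each site can share X^T X, X^T y, and its sample size are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: NOT the estimator from the paper.
# Assumption: each site shares X^T X, X^T y, and n instead of raw data.

def summarize(X, y):
    """Compute the summary statistics a site would share instead of raw rows."""
    p = len(X[0])
    xtx = [[sum(row[j] * row[k] for row in X) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return xtx, xty, len(X)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ridge_from_summaries(summaries, lam):
    """Pooled ridge fit: solve (sum_k X_k^T X_k + lam * I) beta = sum_k X_k^T y_k."""
    p = len(summaries[0][1])
    A = [[lam if j == k else 0.0 for k in range(p)] for j in range(p)]
    b = [0.0] * p
    for xtx, xty, _ in summaries:
        for j in range(p):
            b[j] += xty[j]
            for k in range(p):
                A[j][k] += xtx[j][k]
    return solve(A, b)

# Two hypothetical sites whose responses follow y = 2*x1 + 3*x2 exactly.
site_a = ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [2.0, 3.0, 5.0])
site_b = ([[2.0, 1.0], [1.0, 2.0]], [7.0, 8.0])

summaries = [summarize(X, y) for X, y in (site_a, site_b)]
beta = ridge_from_summaries(summaries, lam=0.0)
print(beta)
```

Because the Gram matrices and cross-products add across sites, the pooled fit from summaries matches the fit that would be obtained from the stacked raw data. Setting `lam` above zero adds ridge shrinkage; how such a tuning parameter should be chosen without raw data is the kind of question the paper's adaptive selection procedure addresses.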
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning in Healthcare · Gene expression and cancer classification
