Multitask Learning and Bandits via Robust Statistics
Kan Xu, Hamsa Bastani

TL;DR
This paper introduces a multitask learning method that combines robust statistics with LASSO to improve sample efficiency across related but heterogeneous tasks and to tighten regret bounds in contextual bandits; it is validated on synthetic and real datasets.
Contribution
It proposes a novel two-stage estimator that exploits a shared global parameter plus sparse instance-specific deviations, improving sample complexity in the feature dimension d; for data-poor instances the improvement is exponential.
Findings
Enhanced sample complexity bounds in feature dimension d
Improved regret bounds in contextual bandit algorithms
Validated effectiveness on synthetic and real datasets
Abstract
Decision-makers often simultaneously face many related but heterogeneous learning problems. For instance, a large retailer may wish to learn product demand at different stores to solve pricing or inventory problems, making it desirable to learn jointly for stores serving similar customers; alternatively, a hospital network may wish to learn patient risk at different providers to allocate personalized interventions, making it desirable to learn jointly for hospitals serving similar patient populations. Motivated by real datasets, we study a natural setting where the unknown parameter in each learning instance can be decomposed into a shared global parameter plus a sparse instance-specific term. We propose a novel two-stage multitask learning estimator that exploits this structure in a sample-efficient way, using a unique combination of robust statistics (to learn across similar…
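The shared-plus-sparse structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes linear models in which each instance's parameter equals a shared vector plus a sparse deviation, uses a coordinate-wise median of per-instance least-squares fits as the robust first stage, and fits the deviations with a simple proximal-gradient (ISTA) LASSO as the second stage. The function names, the choice of the median as the robust aggregate, the penalty value `lam`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    n, d = X.shape
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    b = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = soft_threshold(b - step * grad, step * lam)
    return b


def two_stage_estimate(Xs, ys, lam):
    """Hypothetical two-stage estimator (illustrative only).

    Stage 1: least squares per instance, then a coordinate-wise median
             across instances as a robust estimate of the shared parameter.
    Stage 2: LASSO on each instance's residuals to recover its sparse
             deviation from the shared parameter.
    """
    per_instance = np.array(
        [np.linalg.lstsq(X, y, rcond=None)[0] for X, y in zip(Xs, ys)]
    )
    shared = np.median(per_instance, axis=0)
    deltas = [lasso_ista(X, y - X @ shared, lam) for X, y in zip(Xs, ys)]
    return shared, [shared + delta for delta in deltas]


# Synthetic data (assumed setup): 15 instances, 10 features, 50 samples each.
rng = np.random.default_rng(0)
d, n, num_instances = 10, 50, 15
beta_shared = rng.normal(size=d)

Xs, ys, true_betas = [], [], []
for _ in range(num_instances):
    delta = np.zeros(d)
    delta[rng.choice(d)] = 2.0  # one nonzero coordinate: a sparse deviation
    beta_j = beta_shared + delta
    X = rng.normal(size=(n, d))
    y = X @ beta_j + 0.1 * rng.normal(size=n)
    Xs.append(X)
    ys.append(y)
    true_betas.append(beta_j)

shared_hat, beta_hats = two_stage_estimate(Xs, ys, lam=0.05)
print("max error in shared estimate:", np.max(np.abs(shared_hat - beta_shared)))
print("max error across instances:",
      max(np.max(np.abs(b - t)) for b, t in zip(beta_hats, true_betas)))
```

The median is used here because, on any given coordinate, only a minority of instances deviate from the shared value, so those deviations act like outliers that a robust aggregate can ignore. The L1 penalty in the second stage then shrinks most coordinates of each deviation to zero, matching the sparsity assumption.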
Videos
Learning Across Bandits in High Dimension via Robust Statistics (YouTube)
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Distributed Sensor Networks and Detection Algorithms
