Co-Exploration and Co-Exploitation via Shared Structure in Multi-Task Bandits
Sumantrak Mukherjee, Serafima Lebedeva, Valentin Margraf, Jonas Hanselle, Kanta Yamaoka, Viktor Bengs, Stefan Konigorski, Eyke Hüllermeier, Sebastian Josef Vollmer

TL;DR
This paper introduces a Bayesian framework for multi-task bandits that leverages shared structure to improve exploration, especially under partial observations and complex latent dependencies.
Contribution
It presents a novel particle-based Bayesian approach that models joint task-reward distributions, enabling flexible discovery of inter-task and inter-arm dependencies.
Findings
Outperforms hierarchical bandit models in complex settings
Effectively handles model misspecification
Learns latent reward structures without prior assumptions
Abstract
We propose a novel Bayesian framework for efficient exploration in contextual multi-task multi-armed bandit settings, where the context is only observed partially and dependencies between reward distributions are induced by latent context variables. In order to exploit these structural dependencies, our approach integrates observations across all tasks and learns a global joint distribution, while still allowing personalised inference for new tasks. In this regard, we identify two key sources of epistemic uncertainty, namely structural uncertainty in the latent reward dependencies across arms and tasks, and user-specific uncertainty due to incomplete context and limited interaction history. To put our method into practice, we represent the joint distribution over tasks and rewards using a particle-based approximation of a log-density Gaussian process. This representation enables…
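As a rough illustration of the particle idea described in the abstract, the sketch below keeps a global set of particles shared by all tasks and a per-task weight vector that is updated from that task's own interactions. It is not the authors' method. The paper uses a particle-based approximation of a log-density Gaussian process, whereas this toy version assumes each particle is a fixed vector of Bernoulli mean rewards, and it assumes Thompson sampling as the exploration rule, which the excerpt above does not state. All names (`PARTICLES`, `update_weights`, `thompson_arm`, `run_task`) are hypothetical.

```python
import random

# Hypothetical shared structure: each particle is one hypothesis about the
# mean Bernoulli reward of every arm (a stand-in for a latent "user type").
PARTICLES = [
    [0.9, 0.1, 0.2],
    [0.1, 0.9, 0.2],
    [0.2, 0.2, 0.8],
]


def update_weights(weights, particles, arm, reward):
    """Bayes update of per-task particle weights after one Bernoulli reward."""
    unnormalised = []
    for w, p in zip(weights, particles):
        likelihood = p[arm] if reward == 1 else 1.0 - p[arm]
        unnormalised.append(w * likelihood)
    total = sum(unnormalised)
    if total == 0.0:
        # Every particle was ruled out numerically; fall back to uniform.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in unnormalised]


def thompson_arm(weights, particles, rng):
    """Sample one particle from the task posterior and act greedily under it."""
    particle = rng.choices(particles, weights=weights, k=1)[0]
    return max(range(len(particle)), key=lambda a: particle[a])


def run_task(particles, true_means, steps, rng):
    """Simulate one task: the particles are global, the weights are task-specific."""
    weights = [1.0 / len(particles)] * len(particles)
    for _ in range(steps):
        arm = thompson_arm(weights, particles, rng)
        reward = 1 if rng.random() < true_means[arm] else 0
        weights = update_weights(weights, particles, arm, reward)
    return weights


rng = random.Random(0)
# Assumed ground truth for the demo: the new task behaves like particle 1.
posterior = run_task(PARTICLES, true_means=PARTICLES[1], steps=200, rng=rng)
print([round(w, 3) for w in posterior])
```

The design choice worth noting is the split between what is shared and what is personal: the particle set stands in for structural knowledge pooled across tasks, while the weights capture the user-specific uncertainty that shrinks as a single task accumulates interaction history.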
Taxonomy
Topics: Advanced Bandit Algorithms Research · Explainable Artificial Intelligence (XAI) · Gaussian Processes and Bayesian Inference
