Thompson Sampling for Multi-Objective Linear Contextual Bandit

Somangchan Park, Heesang Ann, and Min-hwan Oh

arXiv:2512.00930·stat.ML·December 2, 2025

TL;DR

This paper introduces MOL-TS, a novel Thompson Sampling algorithm for multi-objective linear contextual bandits, providing Pareto regret guarantees and demonstrating improved empirical performance over existing methods.

Contribution

MOL-TS is the first Thompson Sampling algorithm with Pareto regret guarantees for multi-objective linear bandits, efficiently balancing multiple conflicting objectives.

Findings

01 Achieves a worst-case Pareto regret of $\widetilde{O}(d^{3/2}\sqrt{T})$

02 Outperforms existing methods in empirical regret minimization

03 Effectively balances multiple objectives in experiments
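
The findings refer to Pareto regret without defining it. A common definition in the multi-objective bandit literature, assumed here since this page does not state the paper's exact form, charges each pulled arm its Pareto suboptimality gap: the smallest amount that must be added to every objective of the arm's expected reward vector before no other arm dominates it. The sketch below computes that gap and the resulting cumulative regret. The function names and the reward table are hypothetical and chosen only for illustration.

```python
def pareto_gap(mu, a):
    """Pareto suboptimality gap of arm `a` (assumed, commonly used definition).

    `mu` is a list of expected reward vectors, one per arm. The gap is the
    smallest eps >= 0 such that mu[a] + eps (added to every objective) is no
    longer beaten in all objectives by some other arm.
    """
    gap = 0.0
    for b in range(len(mu)):
        # Arm b exceeds arm a by at least `margin` in every objective.
        margin = min(mb - ma for mb, ma in zip(mu[b], mu[a]))
        gap = max(gap, margin)
    return gap


def pareto_regret(mu, pulls):
    """Cumulative Pareto regret of a sequence of pulled arm indices."""
    return sum(pareto_gap(mu, a) for a in pulls)


# Hypothetical expected rewards: 3 arms, 2 objectives.
# Arms 0 and 1 trade off the two objectives; arm 2 is dominated by arm 0.
mu = [[1.0, 0.2], [0.2, 1.0], [0.5, 0.1]]
gaps = [pareto_gap(mu, a) for a in range(len(mu))]
total = pareto_regret(mu, [0, 2, 1, 2])
```

Under this definition every Pareto-optimal arm has gap zero, so an algorithm incurs no regret for choosing any arm on the true Pareto front, regardless of which trade-off it picks.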

Abstract

We study the multi-objective linear contextual bandit problem, where multiple possibly conflicting objectives must be optimized simultaneously. We propose \texttt{MOL-TS}, the \textit{first} Thompson Sampling algorithm with Pareto regret guarantees for this problem. Unlike standard approaches that compute an empirical Pareto front each round, \texttt{MOL-TS} samples parameters across objectives and efficiently selects an arm from a novel \emph{effective Pareto front}, which accounts for repeated selections over time. Our analysis shows that \texttt{MOL-TS} achieves a worst-case Pareto regret bound of $\widetilde{O}(d^{3/2}\sqrt{T})$, where $d$ is the dimension of the feature vectors and $T$ is the total number of rounds, matching the best known order for randomized linear bandit algorithms in the single-objective setting. Empirical results confirm the benefits of our proposed approach, demonstrating…
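
To make the setting concrete, here is a minimal sketch of a generic Thompson Sampling loop for a multi-objective linear contextual bandit. It is not \texttt{MOL-TS}: the abstract does not specify how the effective Pareto front is built, so this sketch substitutes the plain Pareto front of the sampled reward vectors and picks an arm from it uniformly at random. The Gaussian sampling distribution, the shared Gram matrix, and all names and parameters (`run_mo_linear_ts`, `lam`, `v`, `noise`) are assumptions made for illustration.

```python
import numpy as np


def pareto_front(values):
    """Indices of rows of `values` (arms x objectives) that no other row dominates."""
    front = []
    for i, v in enumerate(values):
        dominated = any(
            np.all(w >= v) and np.any(w > v)
            for j, w in enumerate(values) if j != i
        )
        if not dominated:
            front.append(i)
    return front


def run_mo_linear_ts(theta_true, n_arms=10, T=200, lam=1.0, v=0.5, noise=0.1, seed=0):
    """Generic multi-objective linear TS (illustrative, not the paper's algorithm)."""
    rng = np.random.default_rng(seed)
    M, d = theta_true.shape          # M objectives, d-dimensional features
    V = lam * np.eye(d)              # regularized Gram matrix, shared by objectives
    b = np.zeros((M, d))             # per-objective sum of reward * feature
    chosen = []
    for _ in range(T):
        # Fresh random contexts each round, normalized to unit length.
        X = rng.normal(size=(n_arms, d))
        X /= np.linalg.norm(X, axis=1, keepdims=True)

        V_inv = np.linalg.inv(V)
        V_inv = (V_inv + V_inv.T) / 2            # guard against asymmetry from round-off
        theta_hat = b @ V_inv                    # ridge estimates, shape (M, d)

        # Draw one parameter sample per objective around its ridge estimate.
        theta_tilde = np.stack([
            rng.multivariate_normal(theta_hat[m], v ** 2 * V_inv) for m in range(M)
        ])
        sampled = X @ theta_tilde.T              # sampled rewards, shape (n_arms, M)

        # Assumed selection rule: uniform choice on the sampled Pareto front.
        front = pareto_front(sampled)
        a = front[rng.integers(len(front))]

        x = X[a]
        y = theta_true @ x + noise * rng.normal(size=M)   # noisy reward vector
        V += np.outer(x, x)
        b += np.outer(y, x)
        chosen.append(a)
    theta_final = np.linalg.solve(V, b.T).T      # final ridge estimates
    return theta_final, chosen


# Hypothetical problem: 2 objectives that favor different feature directions.
theta_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
theta_final, chosen = run_mo_linear_ts(theta_true)
```

Recomputing the front from scratch each round is the step the abstract says \texttt{MOL-TS} avoids, so this loop is best read as a baseline for understanding the problem rather than as a reproduction of the proposed method.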

Videos

Thompson Sampling for Multi-Objective Linear Contextual Bandit · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Advanced Multi-Objective Optimization Algorithms