A Finite Time Analysis of Thompson Sampling for Bayesian Optimization with Preferential Feedback
Joseph Lazzaro, Davide Buffelli, Da-shan Shiu, Sattar Vakili

TL;DR
This paper introduces a Thompson Sampling approach for Bayesian optimization using preferential feedback, providing finite-time performance guarantees and demonstrating effectiveness on synthetic and real-world data.
Contribution
It develops a novel TS-based method for preference-based Bayesian optimization with theoretical analysis and practical validation.
Findings
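Finite-time analysis shows performance matching that of standard TS for conventional Bayesian optimization with scalar feedback
Method is demonstrated on both synthetic and real-world examples
Analysis exploits the anchor invariance of TS for challenger selection and introduces a double-TS pairing variant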
Abstract
Preference feedback, in the form of pairwise comparisons rather than scalar scores, has seen increasing use in applications such as human-, laboratory-, and expert-in-the-loop design, as well as scientific discovery. We propose a Thompson Sampling (TS) approach to Bayesian optimization with preferential feedback that models comparisons using a monotone link on latent utility differences and leverages the dueling kernel induced by a base kernel. We provide a finite-time analysis showing that the performance of the proposed method matches that of standard TS for conventional Bayesian optimization with scalar feedback. The analysis exploits the anchor invariance of TS for challenger selection and introduces a double-TS pairing variant. We also demonstrate the performance of the method on both synthetic and real-world examples.
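To make the modeling ingredients named in the abstract concrete, here is a minimal sketch of a dueling kernel built from a base kernel, together with a monotone link on latent utility differences. The abstract does not specify either choice, so the RBF base kernel, the logistic link, and all function names below are illustrative assumptions rather than the authors' implementation. The dueling kernel form used here, k(x, y) + k(x', y') - k(x, y') - k(x', y), is the covariance of the utility difference f(x) - f(x') when f is drawn from a Gaussian process with kernel k.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Base kernel (assumed RBF; the abstract does not name one)."""
    a = np.atleast_2d(a)
    b = np.atleast_2d(b)
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def dueling_kernel(pairs_a, pairs_b, base=rbf_kernel):
    """Kernel on pairs (x, x') induced by a base kernel.

    Equals Cov[f(x) - f(x'), f(y) - f(y')] for f ~ GP(0, base).
    """
    x, x_prime = pairs_a
    y, y_prime = pairs_b
    return (
        base(x, y)
        + base(x_prime, y_prime)
        - base(x, y_prime)
        - base(x_prime, y)
    )


def logistic_link(z):
    """Monotone link (assumed logistic) mapping a utility gap to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def preference_probability(f_x, f_x_prime):
    """Probability that x is preferred to x' under the assumed link."""
    return logistic_link(f_x - f_x_prime)


# Two example duels in one dimension: (0.0 vs 0.5) and (1.0 vs 2.0).
x = np.array([[0.0], [1.0]])
x_prime = np.array([[0.5], [2.0]])
K = dueling_kernel((x, x_prime), (x, x_prime))
```

Swapping the two points of a duel flips the sign of the kernel, and a duel of a point against itself has zero covariance with every other duel, which reflects that only utility differences are modeled.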
