Knowledge Gradient for Preference Learning
Kaiwen Wu, Jacob R. Gardner

TL;DR
This paper develops an exact analytical knowledge gradient for Bayesian optimization using pairwise preference queries, enabling efficient optimization in settings where direct function evaluations are infeasible.
Contribution
It extends the knowledge gradient to preferential Bayesian optimization by deriving an exact, analytical form, overcoming previous computational intractability.
Findings
The exact knowledge gradient performs strongly on a suite of benchmark problems, often outperforming existing acquisition functions.
The look-ahead step in the preferential setting requires a non-Gaussian posterior, previously considered intractable; the derived knowledge gradient handles it exactly and analytically.
Limitations of the knowledge gradient are also discussed through a case study.
Abstract
The knowledge gradient is a popular acquisition function in Bayesian optimization (BO) for optimizing black-box objectives with noisy function evaluations. Many practical settings, however, allow only pairwise comparison queries, yielding a preferential BO problem where direct function evaluations are unavailable. Extending the knowledge gradient to preferential BO is hindered by its computational challenge. At its core, the look-ahead step in the preferential setting requires computing a non-Gaussian posterior, which was previously considered intractable. In this paper, we address this challenge by deriving an exact and analytical knowledge gradient for preferential BO. We show that the exact knowledge gradient performs strongly on a suite of benchmark problems, often outperforming existing acquisition functions. In addition, we also present a case study illustrating the limitation of…
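To make the look-ahead step described in the abstract concrete, the sketch below estimates a one-step knowledge gradient for a single pairwise query by plain Monte Carlo. It is not the exact analytical form derived in the paper. It assumes a finite candidate set, a Gaussian belief over the latent values, and a probit comparison likelihood; the function name `mc_preferential_kg` and the parameters `noise`, `n_samples`, and `seed` are hypothetical and chosen only for illustration.

```python
import math
import random


def cholesky(S):
    """Lower-triangular Cholesky factor of a small positive-definite matrix."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(S[i][i] - s)
            else:
                L[i][j] = (S[i][j] - s) / L[j][j]
    return L


def normal_cdf(t):
    """Standard normal CDF, used here as an assumed probit likelihood."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def mc_preferential_kg(mean, cov, i, j, noise=1.0, n_samples=4000, seed=0):
    """Monte Carlo estimate of the knowledge gradient of comparing items i and j.

    KG = sum_z P(z) * max_x E[f(x) | z] - max_x E[f(x)],
    where z ranges over the two comparison outcomes.
    """
    rng = random.Random(seed)
    n = len(mean)
    L = cholesky(cov)
    acc_win = [0.0] * n   # accumulates P(i preferred | f) * f(x)
    acc_lose = [0.0] * n  # accumulates P(j preferred | f) * f(x)
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Draw f ~ N(mean, cov) through the Cholesky factor.
        f = [mean[a] + sum(L[a][b] * z[b] for b in range(a + 1)) for a in range(n)]
        p = normal_cdf((f[i] - f[j]) / noise)
        for a in range(n):
            acc_win[a] += p * f[a]
            acc_lose[a] += (1.0 - p) * f[a]
    # P(z) * E[f(x) | z] equals E[1{z} f(x)], so the outcome probabilities
    # are already folded into the accumulators.
    lookahead = (max(acc_win) + max(acc_lose)) / n_samples
    # Use the sample mean as the baseline so the estimate is never negative.
    baseline = max(w + l for w, l in zip(acc_win, acc_lose)) / n_samples
    return lookahead - baseline


if __name__ == "__main__":
    mean = [0.0, 0.0, 0.0]
    cov = [[1.0, 0.5, 0.1], [0.5, 1.0, 0.5], [0.1, 0.5, 1.0]]
    print(mc_preferential_kg(mean, cov, 0, 2))
```

Comparing an item with itself yields an estimate of zero, since the outcome carries no information. The estimate for two distinct items is nonnegative by construction, because the sum of the per-outcome maxima can never fall below the maximum of the summed values.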
Taxonomy
Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms
