Bayes-Optimal Entropy Pursuit for Active Choice-Based Preference Learning
Stephen N. Pallone, Peter I. Frazier, and Shane G. Henderson

TL;DR
This paper develops a Bayesian active learning framework for preference learning using choice-based queries, demonstrating that a greedy entropy reduction policy is optimal under certain conditions and comparing it with other strategies.
Contribution
The paper characterizes the Bayes-optimal policy for preference learning via choice-based queries, shows that under certain noise assumptions this policy is greedy, and provides performance bounds relating entropy and error.
Findings
The greedy entropy-reduction policy is Bayes-optimal under certain noise assumptions.
Performance bounds for the optimal policy are linear in differential entropy.
Numerical comparisons illustrate the greedy policy's effectiveness against other strategies.
Abstract
We analyze the problem of learning a single user's preferences in an active learning setting, sequentially and adaptively querying the user over a finite time horizon. Learning is conducted via choice-based queries, where the user selects her preferred option among a small subset of offered alternatives. These queries have been shown to be a robust and efficient way to learn an individual's preferences. We take a parametric approach and model the user's preferences through a linear classifier, using a Bayesian prior to encode our current knowledge of this classifier. The rate at which we learn depends on the alternatives offered at every time epoch. Under certain noise assumptions, we show that the Bayes-optimal policy for maximally reducing entropy of the posterior distribution of this linear classifier is a greedy policy, and that this policy achieves a linear lower bound when…
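To make the greedy entropy-reduction idea concrete, here is a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: the posterior over the linear classifier is approximated by a finite set of weighted particles (so Shannon entropy stands in for differential entropy), the user is assumed to pick her utility-maximizing alternative with probability 1 - noise and a uniformly random alternative otherwise, and all function names are hypothetical. At each step the policy offers the candidate query with the largest expected reduction in posterior entropy, which equals the mutual information between the user's response and the classifier.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def response_likelihood(thetas, alternatives, noise=0.1):
    """P(user picks alternative j | theta_i), as an (n_particles, m) array.

    Assumed noise model (not taken from the paper): with probability
    1 - noise the user picks the alternative with the highest linear
    utility theta . x; otherwise she picks uniformly at random.
    """
    utilities = thetas @ alternatives.T
    best = utilities.argmax(axis=1)
    m = alternatives.shape[0]
    lik = np.full(utilities.shape, noise / m)
    lik[np.arange(len(thetas)), best] += 1.0 - noise
    return lik


def expected_entropy_reduction(weights, lik):
    """Mutual information I(response; theta) = H(Y) - H(Y | theta)."""
    marginal = weights @ lik
    conditional = sum(w * entropy(row) for w, row in zip(weights, lik))
    return entropy(marginal) - conditional


def greedy_query(weights, thetas, candidate_queries, noise=0.1):
    """Index of the candidate query with the largest expected entropy reduction."""
    gains = [
        expected_entropy_reduction(weights, response_likelihood(thetas, q, noise))
        for q in candidate_queries
    ]
    return int(np.argmax(gains))


def bayes_update(weights, lik, choice):
    """Posterior particle weights after observing the user's choice."""
    posterior = weights * lik[:, choice]
    return posterior / posterior.sum()


# Toy usage: four particles on the unit axes, two candidate pairwise queries.
thetas = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
weights = np.full(4, 0.25)
queries = [
    np.array([[1.0, 0.0], [-1.0, 0.0]]),
    np.array([[1.0, 1.0], [-1.0, -1.0]]),
]
chosen = greedy_query(weights, thetas, queries)
lik = response_likelihood(thetas, queries[chosen])
weights_after = bayes_update(weights, lik, choice=0)
```

In this toy setup the second query splits the particles evenly between the two responses, which is why a greedy entropy criterion prefers it over the first query, whose responses are unbalanced.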
Taxonomy
Topics: Machine Learning and Algorithms · Advanced Bandit Algorithms Research · Data Stream Mining Techniques
