Eliciting User Preferences for Personalized Multi-Objective Decision Making through Comparative Feedback

Han Shao; Lee Cohen; Avrim Blum; Yishay Mansour; Aadirupa Saha; Matthew R. Walter

arXiv:2302.03805·cs.LG·November 2, 2023

Open Access · 1 Video

TL;DR

This paper introduces a framework for personalized multi-objective decision making that learns user preferences through comparative feedback, enabling the computation of near-optimal policies tailored to individual priorities.

Contribution

The paper proposes an approach for incorporating user preferences into multi-objective RL, using comparison-based feedback models and efficient algorithms for preference elicitation.

Findings

1. Effective algorithms for preference elicitation with few comparison queries

2. Ability to learn personalized policies in multi-objective settings

3. Framework accommodates different types of user feedback

Abstract

In classic reinforcement learning (RL) and decision making problems, policies are evaluated with respect to a scalar reward function, and all optimal policies are the same with regard to their expected return. However, many real-world problems involve balancing multiple, sometimes conflicting, objectives whose relative priority will vary according to the preferences of each user. Consequently, a policy that is optimal for one user might be sub-optimal for another. In this work, we propose a multi-objective decision making framework that accommodates different user preferences over objectives, where preferences are learned via policy comparisons. Our model consists of a Markov decision process with a vector-valued reward function, with each user having an unknown preference vector that expresses the relative importance of each objective. The goal is to efficiently compute a near-optimal…
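To make the setup in the abstract concrete, the following minimal Python sketch shows how two users with different preference vectors can rank the same pair of policies differently. It assumes that a user's personalized value for a policy is the inner product of the preference vector with the policy's per-objective expected returns; that linear form, the function names `scalarize` and `compare`, and all numbers are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def scalarize(preference: Sequence[float], value_vector: Sequence[float]) -> float:
    """Personalized value, ASSUMED here to be the inner product of the
    preference weights and the per-objective expected returns."""
    if len(preference) != len(value_vector):
        raise ValueError("preference and value_vector must have the same length")
    return sum(w * v for w, v in zip(preference, value_vector))


def compare(
    preference: Sequence[float],
    value_a: Sequence[float],
    value_b: Sequence[float],
) -> int:
    """Simulated comparative feedback from one user.

    Returns 1 if policy A is preferred, -1 if policy B is preferred, 0 on a tie.
    """
    diff = scalarize(preference, value_a) - scalarize(preference, value_b)
    return (diff > 0) - (diff < 0)


# Hypothetical per-objective expected returns (objective 1: speed, objective 2: safety).
fast_policy = [0.9, 0.2]
safe_policy = [0.4, 0.8]

# Two hypothetical users whose preference vectors are hidden from the learner.
speed_lover = [0.8, 0.2]
safety_lover = [0.2, 0.8]

print(compare(speed_lover, fast_policy, safe_policy))   # 1: prefers the fast policy
print(compare(safety_lover, fast_policy, safe_policy))  # -1: prefers the safe policy
```

In this toy setting the learner only observes the sign returned by `compare`, never the preference vector itself, which is the kind of comparison-based signal the abstract describes.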

Videos

Eliciting User Preferences for Personalized Multi-Objective Decision Making through Comparative Feedback · SlidesLive

Taxonomy

Topics: Recommender Systems and Techniques · Multi-Criteria Decision Making · Reinforcement Learning in Robotics