Active Preference-Based Gaussian Process Regression for Reward Learning
Erdem Bıyık, Nicolas Huynh, Mykel J. Kochenderfer, Dorsa Sadigh

TL;DR
This paper introduces an active preference-based Gaussian Process method for reward learning. It infers expressive reward functions from human preferences between trajectories, avoiding both the rigid reward structure (e.g. linearity in predefined features) and the large data requirements of demonstration-based approaches.
Contribution
It proposes a novel active learning framework using Gaussian Processes to learn reward functions solely from human preferences without assuming strict reward structures.
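As a rough illustration of the idea, the sketch below places a Gaussian Process prior on the rewards of a few trajectories and finds the most probable rewards given pairwise preferences. It is a minimal sketch under stated assumptions, not the authors' implementation: the RBF kernel, the logistic preference likelihood, the Newton solver, and all names (`fit_map_rewards`, `preference_probability`, the toy `features` and `prefs`) are hypothetical choices made for this example. The active query-selection step is not shown.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between rows of A and rows of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_map_rewards(features, prefs, n_iter=25, jitter=1e-6):
    """Newton iterations for the MAP rewards f at the given trajectories.

    features: (n, d) array, one hypothetical feature vector per trajectory.
    prefs: list of (winner, loser) index pairs reported by the human.
    Assumes a logistic likelihood P(w preferred over l) = sigmoid(f[w] - f[l]).
    """
    n = features.shape[0]
    K = rbf_kernel(features, features) + jitter * np.eye(n)
    f = np.zeros(n)
    for _ in range(n_iter):
        grad = np.zeros(n)    # gradient of the log-likelihood w.r.t. f
        W = np.zeros((n, n))  # negative Hessian of the log-likelihood
        for w, l in prefs:
            s = sigmoid(f[w] - f[l])
            grad[w] += 1.0 - s
            grad[l] -= 1.0 - s
            c = s * (1.0 - s)
            W[w, w] += c
            W[l, l] += c
            W[w, l] -= c
            W[l, w] -= c
        # Newton step f = (K^-1 + W)^-1 (W f + grad), written without K^-1.
        f = K @ np.linalg.solve(np.eye(n) + W @ K, W @ f + grad)
    return f

def preference_probability(f, a, b):
    """Modelled probability that trajectory a is preferred over trajectory b."""
    return sigmoid(f[a] - f[b])

# Toy data: four trajectories with a single feature; larger values are preferred.
features = np.array([[0.0], [1.0], [2.0], [3.0]])
prefs = [(3, 2), (2, 1), (1, 0), (3, 0)]
rewards = fit_map_rewards(features, prefs)
print(rewards)
print(preference_probability(rewards, 3, 0))
```

Because the kernel only assumes that trajectories with similar features have similar rewards, no linear or otherwise fixed reward form is imposed; swapping in a different kernel or likelihood changes only `rbf_kernel` and the gradient/Hessian terms.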
Findings
Efficiently learns reward functions from limited human preferences.
Outperforms existing methods in simulation and user studies.
Handles high-dimensional robotic tasks effectively.
Abstract
Designing reward functions is a challenging problem in AI and robotics. Humans usually have a difficult time directly specifying all the desirable behaviors that a robot needs to optimize. One common approach is to learn reward functions from collected expert demonstrations. However, learning reward functions from demonstrations introduces many challenges: some methods require highly structured models, e.g. reward functions that are linear in some predefined set of features, while others adopt less structured reward functions that, on the other hand, require tremendous amounts of data. In addition, humans tend to have a difficult time providing demonstrations on robots with high degrees of freedom, or even quantifying reward values for given demonstrations. To address these challenges, we present a preference-based learning approach, where as an alternative, the human feedback is only in…
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Advanced Control Systems Optimization · Advanced Multi-Objective Optimization Algorithms
Methods: Gaussian Process
