Efficient Controller Learning from Human Preferences and Numerical Data Via Multi-Modal Surrogate Models
Lukas Theiner, Maik Pfefferkorn, Yongpeng Zhao, Sebastian Hirt, Rolf Findeisen

TL;DR
This paper introduces a multi-modal Bayesian optimization framework that efficiently combines numerical data and human preferences to tune control policies, reducing human involvement and improving adaptation to individual preferences.
Contribution
It develops a novel multi-fidelity, multi-modal Bayesian optimization method using Gaussian process models to integrate numerical and preference data for control policy tuning.
Findings
Significantly reduces human-in-the-loop experiments
Effectively adapts driving style to individual preferences
Improves data efficiency in policy optimization
Abstract
Tuning control policies manually to meet high-level objectives is often time-consuming. Bayesian optimization provides a data-efficient framework for automating this process using numerical evaluations of an objective function. However, many systems, particularly those involving humans, require optimization based on subjective criteria. Preferential Bayesian optimization addresses this by learning from pairwise comparisons instead of quantitative measurements, but relying solely on preference data can be inefficient. We propose a multi-fidelity, multi-modal Bayesian optimization framework that integrates low-fidelity numerical data with high-fidelity human preferences. Our approach employs Gaussian process surrogate models with both hierarchical, autoregressive and non-hierarchical, coregionalization-based structures, enabling efficient learning from mixed-modality data. We illustrate…
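To make the two ingredients named in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Kennedy-O'Hagan-style autoregressive link f_hi(x) = rho * f_lo(x) + delta(x) between the low-fidelity numerical objective and the high-fidelity latent utility, and a probit likelihood for pairwise preferences, which is a common choice in preferential Gaussian process models. The function names, kernel choice, and hyperparameter values (rho, lengthscale, noise) are illustrative assumptions that do not come from the paper.

```python
import numpy as np
from math import erf, sqrt


def rbf(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)


def ar1_joint_cov(x_lo, x_hi, rho=0.8, delta_variance=0.1):
    """Joint prior covariance of [f_lo(x_lo), f_hi(x_hi)].

    Assumes f_hi(x) = rho * f_lo(x) + delta(x), with independent
    zero-mean GP priors on f_lo and on the discrepancy delta.
    """
    k_ll = rbf(x_lo, x_lo)
    k_lh = rho * rbf(x_lo, x_hi)
    k_hh = rho**2 * rbf(x_hi, x_hi) + rbf(x_hi, x_hi, variance=delta_variance)
    return np.block([[k_ll, k_lh], [k_lh.T, k_hh]])


def preference_prob(f_a, f_b, noise=0.1):
    """Probit probability that option a is preferred over option b,
    given latent utilities f_a and f_b with Gaussian comparison noise."""
    z = (f_a - f_b) / (sqrt(2.0) * noise)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Illustrative inputs: many cheap numerical evaluations, few human comparisons.
x_lo = np.linspace(0.0, 1.0, 6)
x_hi = np.array([0.2, 0.7])
K = ar1_joint_cov(x_lo, x_hi)
p = preference_prob(0.3, 0.1)
```

In this sketch the off-diagonal block of K is what lets cheap numerical evaluations inform the latent utility behind the human comparisons; a coregionalization-based model would instead parameterize the cross-covariance without imposing this hierarchy.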
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms
