Efficient Controller Learning from Human Preferences and Numerical Data Via Multi-Modal Surrogate Models

Lukas Theiner; Maik Pfefferkorn; Yongpeng Zhao; Sebastian Hirt; Rolf Findeisen

arXiv:2603.24138·cs.LG·March 26, 2026

Open Access

TL;DR

This paper introduces a multi-modal Bayesian optimization framework that efficiently combines numerical data and human preferences to tune control policies, reducing human involvement and improving adaptation to individual preferences.

Contribution

The paper develops a multi-fidelity, multi-modal Bayesian optimization method that uses Gaussian process surrogate models to combine low-fidelity numerical data with high-fidelity human preference data for control policy tuning.

Findings

01. Significantly reduces human-in-the-loop experiments

02. Effectively adapts driving style to individual preferences

03. Improves data efficiency in policy optimization

Abstract

Tuning control policies manually to meet high-level objectives is often time-consuming. Bayesian optimization provides a data-efficient framework for automating this process using numerical evaluations of an objective function. However, many systems, particularly those involving humans, require optimization based on subjective criteria. Preferential Bayesian optimization addresses this by learning from pairwise comparisons instead of quantitative measurements, but relying solely on preference data can be inefficient. We propose a multi-fidelity, multi-modal Bayesian optimization framework that integrates low-fidelity numerical data with high-fidelity human preferences. Our approach employs Gaussian process surrogate models with both hierarchical, autoregressive and non-hierarchical, coregionalization-based structures, enabling efficient learning from mixed-modality data. We illustrate…
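The two ingredients named in the abstract, a hierarchical autoregressive surrogate and learning from pairwise comparisons, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard first-order autoregressive form f_high(x) = rho * f_low(x) + delta(x) with independent Gaussian process priors on f_low and delta, and a probit likelihood for preferences in which a larger latent value means "preferred". The function names, kernel hyperparameters, `rho`, and `noise` are all hypothetical choices made for illustration.

```python
import math


def rbf(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return variance * math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


def ar1_cov(x1, level1, x2, level2, rho=0.8):
    """Prior covariance between fidelity levels (0 = low, 1 = high).

    Assumed model: f_high = rho * f_low + delta, where f_low and delta
    have independent GP priors with the kernels below.
    """
    k_low = rbf(x1, x2)                                    # kernel of f_low
    k_delta = rbf(x1, x2, lengthscale=0.5, variance=0.2)   # kernel of delta
    if level1 == 0 and level2 == 0:
        return k_low
    if level1 == 1 and level2 == 1:
        return rho ** 2 * k_low + k_delta
    return rho * k_low  # cross-covariance between low and high fidelity


def preference_likelihood(f_a, f_b, noise=0.1):
    """Probit probability that option a is preferred over option b.

    Assumes each latent value is observed with independent Gaussian
    noise of standard deviation `noise` before the comparison is made.
    """
    z = (f_a - f_b) / (math.sqrt(2.0) * noise)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF


if __name__ == "__main__":
    # Low-fidelity numerical data informs the high-fidelity latent
    # function through the nonzero cross-covariance.
    print(ar1_cov(0.3, 0, 0.4, 1))
    # Probability that a controller with latent utility 0.2 is preferred
    # over one with latent utility 0.1.
    print(preference_likelihood(0.2, 0.1))
```

Under these assumptions, the cross-covariance term `rho * k_low` is what lets inexpensive numerical evaluations shape the posterior over the preference-level function, so fewer human comparisons are needed. A coregionalization-based model would instead replace the fixed hierarchy with a learned covariance matrix across outputs.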

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms