Learning the MPC objective function from human preferences
Pablo Krupa, Hasna El Hasnaouy, Mario Zanon, Alberto Bemporad

TL;DR
This paper introduces a data-driven method that learns MPC objective functions from human preferences over trajectory pairs, so that closed-loop behavior aligns with human judgment without hand-designing the objective.
Contribution
It proposes a preference-based learning framework that constructs MPC objectives directly from human preference data, bypassing complex manual design.
Findings
Learned objective functions produce trajectories aligned with human preferences
The approach models preference learning as a classification task
Numerical results validate the effectiveness of the learned objectives
Abstract
In Model Predictive Control (MPC), the objective function plays a central role in determining the closed-loop behavior of the system, and must therefore be designed to achieve the desired closed-loop performance. However, in real-world scenarios, its design is often challenging, as it requires balancing complex trade-offs and accurately capturing a performance criterion that may not be easily quantifiable in terms of an objective function. This paper explores preference-based learning as a data-driven approach to constructing an objective function from human preferences over trajectory pairs. We formulate the learning problem as a machine learning classification task to learn a surrogate model that estimates the likelihood of a trajectory being preferred over another. The approach provides a surrogate model that can directly be used as an MPC objective function. Numerical results show…
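To make the classification framing concrete, here is a minimal sketch, not the authors' implementation. It assumes a Bradley-Terry-style logistic model in which the probability that trajectory A is preferred over B is a sigmoid of the cost difference J(B) - J(A), with a hypothetical cost that is linear in summed squared states and inputs. The feature choice, the synthetic data, the `true_theta` weights, and all function names are illustrative assumptions; the abstract does not specify the surrogate model the paper uses.

```python
import numpy as np

def features(traj):
    """Assumed features: sum of squared signals along a trajectory (T x d array)."""
    return np.sum(traj ** 2, axis=0)

def sigmoid(z):
    # Clip to avoid overflow in exp for large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -50.0, 50.0)))

def fit_preference_weights(phi_a, phi_b, labels, lr=0.05, steps=3000):
    """Logistic regression on feature differences.

    labels[i] = 1 if trajectory A_i was preferred over B_i, else 0.
    Model: P(A preferred) = sigmoid(theta . (phi_b - phi_a)),
    so the trajectory with the lower cost theta . phi is more likely preferred.
    """
    diff = phi_b - phi_a
    theta = np.zeros(diff.shape[1])
    for _ in range(steps):
        p = sigmoid(diff @ theta)
        grad = diff.T @ (p - labels) / len(labels)  # gradient of mean log-loss
        theta -= lr * grad
    return theta

# Synthetic preference data from a hidden (hypothetical) cost, for illustration only.
rng = np.random.default_rng(0)
n_pairs, horizon, dim = 400, 10, 3
true_theta = np.array([1.0, 0.2, 0.05])
trajs_a = rng.normal(size=(n_pairs, horizon, dim))
trajs_b = rng.normal(size=(n_pairs, horizon, dim))
phi_a = np.array([features(t) for t in trajs_a])
phi_b = np.array([features(t) for t in trajs_b])
labels = (phi_a @ true_theta < phi_b @ true_theta).astype(float)

theta = fit_preference_weights(phi_a, phi_b, labels)

# Fraction of training pairs on which the learned model agrees with the labels.
pred = (sigmoid((phi_b - phi_a) @ theta) > 0.5).astype(float)
accuracy = float(np.mean(pred == labels))
print("learned weights:", theta, "agreement:", accuracy)
```

Under these assumptions, the learned cost `theta . features(traj)` is the quantity that would be minimized inside the MPC problem; only the ordering it induces over trajectories matters, so its scale is not identified by the preference data.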
Taxonomy
Topics: Advanced Control Systems Optimization · Control Systems and Identification · Advanced Multi-Objective Optimization Algorithms
