Active Learning MPC Objective Functions from Preferences
Hasna El Hasnaouy, Pablo Krupa, Mario Zanon, Alberto Bemporad

TL;DR
This paper introduces active learning strategies to efficiently learn MPC objective functions from human preferences, reducing the number of queries needed for effective control.
Contribution
It proposes two active learning strategies for preference-based MPC, one pool-based and one based on query synthesis, that improve sampling efficiency and reach alignment with human preferences in fewer queries.
Findings
The proposed strategies align the controller with human preferences better than random sampling
Fewer queries are needed to achieve desired control behavior
Numerical results validate the effectiveness of the active learning approaches
Abstract
Designing the objective function in Model Predictive Control (MPC) is challenging when performance assessment criteria are available only from human judgment. We adopt a preference-based learning (PbL) approach to learn the MPC objective function from preferences over trajectory pairs. However, the real-world application of PbL is often restricted by the significant cost or limited availability of human preference queries. To address this, Active Learning (AL) strategies seek to improve sampling efficiency, reducing the labeling effort required to obtain a well-performing classifier. We present two AL strategies for learning the MPC objective function from human preferences over pairwise system trajectories: a pool-based strategy that selects trajectory pairs that are both uncertain under the current surrogate and diverse relative to previously labeled comparisons, and a query-synthesis…
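To make the pool-based idea concrete, the following is a minimal sketch of scoring candidate trajectory pairs by uncertainty and diversity. It is not the authors' method: the abstract does not give their acquisition criteria, so the linear surrogate with weights `w`, the logistic preference model, the min-distance diversity measure, the `trade_off` parameter, and the function name `select_query` are all assumptions made for illustration.

```python
import numpy as np

def select_query(pool_diffs, labeled_diffs, w, trade_off=0.5):
    """Pick the index of the unlabeled pair to show the human next.

    pool_diffs:    (n, d) feature differences phi(traj_a) - phi(traj_b), unlabeled pairs
    labeled_diffs: (m, d) feature differences of already labeled pairs (m may be 0)
    w:             (d,) weights of an assumed linear surrogate of the objective
    trade_off:     0 = uncertainty only, 1 = diversity only (hypothetical knob)
    """
    pool_diffs = np.asarray(pool_diffs, dtype=float)
    labeled_diffs = np.asarray(labeled_diffs, dtype=float).reshape(-1, pool_diffs.shape[1])

    # Assumed logistic preference model: P(a preferred over b) = sigmoid(w . diff).
    p = 1.0 / (1.0 + np.exp(-pool_diffs @ w))
    # Uncertainty is 1 when the surrogate is undecided (p = 0.5), 0 when it is sure.
    uncertainty = 1.0 - 2.0 * np.abs(p - 0.5)

    if labeled_diffs.shape[0] == 0:
        # Nothing labeled yet: every candidate is equally novel.
        diversity = np.ones(len(pool_diffs))
    else:
        # Distance from each candidate to its nearest labeled comparison, scaled to [0, 1].
        dists = np.linalg.norm(pool_diffs[:, None, :] - labeled_diffs[None, :, :], axis=2)
        diversity = dists.min(axis=1)
        diversity = diversity / (diversity.max() + 1e-12)

    score = (1.0 - trade_off) * uncertainty + trade_off * diversity
    return int(np.argmax(score))
```

In a labeling loop under these assumptions, one would call `select_query`, ask the human which trajectory of the chosen pair is preferred, move that pair into the labeled set, refit `w`, and repeat until the query budget is spent.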
