Bridging RL and MPC for mixed-integer optimal control with application to Formula 1 race strategies

Joschua Wüthrich; Romir Damle; Giona Fieni; Melanie N. Zeilinger; Christopher H. Onder; Andrea Carron

arXiv:2604.00826·eess.SY·April 2, 2026

TL;DR

This paper introduces a hybrid RL-MPC framework for mixed-integer optimal control and validates it on a Formula 1 race strategy problem, where it achieves near-optimal performance relative to offline benchmarks, outperforms standalone RL agents, and adapts to disturbances without retraining.

Contribution

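The RL agent is trained on the full hybrid (discrete and continuous) action space rather than on the discrete actions alone, which keeps it consistent with the cost of the underlying Markov decision process. At deployment, modular extensions to the MPC allow adaptation to disturbances without retraining the agent.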

Findings

01. Achieves near-optimal performance compared to offline benchmarks.

02. Outperforms standalone RL agents in the race strategy task.

03. Enables disturbance adaptation through modular MPC extensions.

Abstract

We propose a hybrid reinforcement learning (RL) and model predictive control (MPC) framework for mixed-integer optimal control, where discrete variables enter the cost and dynamics but not the constraints. Existing hierarchical approaches use RL only for the discrete action space, leaving continuous optimization to MPC. Unlike these methods, we train the RL agent on the full hybrid action space, ensuring consistency with the cost of the underlying Markov decision process. During deployment, the RL actor is rolled out over the prediction horizon to parametrize an integer-free nonlinear MPC through the discrete action sequence and provide a continuous warm-start. The learned critic serves as a terminal cost to capture long-term performance. We prove recursive feasibility, and validate the framework on a Formula 1 race strategy problem. The hybrid method achieves near-optimal performance…
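
The deployment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scalar dynamics, stage cost, hand-written `actor`, quadratic `critic`, and the projected gradient descent used in place of a nonlinear MPC solver are all hypothetical stand-ins chosen so the example runs with the standard library alone. It shows only the structure: roll out the actor to obtain a discrete sequence and a continuous warm-start, fix the discrete sequence, optimize the continuous inputs, and add the critic as a terminal cost.

```python
from typing import List, Tuple

# Hypothetical toy problem: the discrete input d enters the dynamics and
# the cost, but the only constraint is a box bound on the continuous input u.
U_MIN, U_MAX = -1.0, 1.0

def dynamics(x: float, u: float, d: int) -> float:
    return x + u + 0.5 * d

def stage_cost(x: float, u: float, d: int) -> float:
    return x * x + 0.1 * u * u + 0.2 * d

def actor(x: float) -> Tuple[int, float]:
    """Stand-in for a trained RL actor over the hybrid action space."""
    d = 1 if x < -1.0 else 0
    u = max(U_MIN, min(U_MAX, -0.5 * x))
    return d, u

def critic(x: float) -> float:
    """Stand-in for a learned critic, used here as a terminal cost."""
    return 2.0 * x * x

def rollout_actor(x0: float, horizon: int) -> Tuple[List[int], List[float]]:
    """Roll the actor out over the horizon: discrete sequence plus warm-start."""
    x, d_seq, u_seq = x0, [], []
    for _ in range(horizon):
        d, u = actor(x)
        d_seq.append(d)
        u_seq.append(u)
        x = dynamics(x, u, d)
    return d_seq, u_seq

def total_cost(x0: float, u_seq: List[float], d_seq: List[int]) -> float:
    """Sum of stage costs over the horizon plus the critic at the final state."""
    x, cost = x0, 0.0
    for u, d in zip(u_seq, d_seq):
        cost += stage_cost(x, u, d)
        x = dynamics(x, u, d)
    return cost + critic(x)

def solve_continuous(x0, d_seq, u_warm, iters=200, step=0.1, eps=1e-6):
    """Integer-free problem: d_seq is fixed, only u is optimized.
    Projected gradient descent with finite differences replaces an NMPC
    solver; a candidate is accepted only if it lowers the cost."""
    u = list(u_warm)
    best = total_cost(x0, u, d_seq)
    for _ in range(iters):
        grad = []
        for k in range(len(u)):
            bumped = u[:k] + [u[k] + eps] + u[k + 1:]
            grad.append((total_cost(x0, bumped, d_seq) - best) / eps)
        cand = [max(U_MIN, min(U_MAX, uk - step * g)) for uk, g in zip(u, grad)]
        cand_cost = total_cost(x0, cand, d_seq)
        if cand_cost < best:
            u, best = cand, cand_cost
        else:
            step *= 0.5  # shrink the step when no improvement is found
    return u, best

def hybrid_step(x0: float, horizon: int = 5):
    """One receding-horizon step: returns the first hybrid action and costs."""
    d_seq, u_warm = rollout_actor(x0, horizon)
    warm_cost = total_cost(x0, u_warm, d_seq)
    u_opt, opt_cost = solve_continuous(x0, d_seq, u_warm)
    return d_seq[0], u_opt[0], warm_cost, opt_cost
```

Calling `hybrid_step(3.0)` returns the first discrete and continuous actions together with the warm-start and optimized costs. Because the optimizer starts from the actor's warm-start and only accepts improving candidates, the optimized cost in this sketch is never worse than the cost of the pure actor rollout.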
