Impact of Markov Decision Process Design on Sim-to-Real Reinforcement Learning

Tatjana Krau; Jorge Mandlmaier; Tobias Damm; Frieder Heieck

arXiv:2603.09427·cs.LG·March 13, 2026

TL;DR

This paper systematically examines how MDP design choices influence the transferability of RL policies from simulation to real-world industrial process control, providing practical guidelines for improved deployment.

Contribution

The paper analyzes how MDP design factors affect sim-to-real transfer in RL and validates the analysis with experiments on a color mixing task, both in simulation and on real hardware.

Findings

01 Physics-based dynamics models reach up to 50% real-world success under strict precision constraints.

02 Simplified dynamics models fail entirely under the same strict precision constraints.

03 The results yield practical MDP design guidelines for deploying RL in industrial process control.

Abstract

Reinforcement Learning (RL) has demonstrated strong potential for industrial process control, yet policies trained in simulation often suffer from a significant sim-to-real gap when deployed on physical hardware. This work systematically analyzes how core Markov Decision Process (MDP) design choices (state composition, target inclusion, reward formulation, termination criteria, and environment dynamics models) affect this transfer. Using a color mixing task, we evaluate different MDP configurations and mixing dynamics across simulation and real-world experiments. We validate our findings on physical hardware, demonstrating that physics-based dynamics models achieve up to 50% real-world success under strict precision constraints where simplified models fail entirely. Our results provide practical MDP design guidelines for deploying RL in industrial process control.
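As an illustration only, the design choices named in the abstract can be pictured as switches on a toy environment. The sketch below is not the authors' implementation: the names `MDPConfig` and `ColorMixEnv`, the three pure pigments, the linear volume-weighted mixing rule, and the default tolerance are all assumptions made for this example. A physics-based mixing model would replace `_color`.

```python
import math
from dataclasses import dataclass


@dataclass
class MDPConfig:
    """Hypothetical switches for the MDP design choices (names are assumptions)."""
    include_target: bool = True   # target inclusion: append target color to the state
    dense_reward: bool = True     # reward formulation: -distance vs. sparse success bonus
    tolerance: float = 0.05       # termination: success threshold on color distance
    max_steps: int = 20           # termination: step budget


class ColorMixEnv:
    """Toy color-mixing task. Linear mixing is an assumed, simplified dynamics model."""

    # Assumed pigments: pure red, green, and blue.
    PIGMENTS = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

    def __init__(self, config):
        self.cfg = config
        self.volumes = [0.0, 0.0, 0.0]
        self.target = (0.0, 0.0, 0.0)
        self.steps = 0

    def reset(self, target):
        self.volumes = [0.0, 0.0, 0.0]
        self.target = tuple(target)
        self.steps = 0
        return self._observe()

    def _color(self):
        # Simplified dynamics: volume-weighted average of the pigment colors.
        total = sum(self.volumes)
        if total == 0.0:
            return (0.0, 0.0, 0.0)
        return tuple(
            sum(v * p[c] for v, p in zip(self.volumes, self.PIGMENTS)) / total
            for c in range(3)
        )

    def _observe(self):
        # State composition: current color, optionally followed by the target.
        obs = list(self._color())
        if self.cfg.include_target:
            obs += list(self.target)
        return obs

    def step(self, action):
        pigment, amount = action  # action: (pigment index, volume to add)
        if amount < 0:
            raise ValueError("amount must be non-negative")
        self.volumes[pigment] += amount
        self.steps += 1
        dist = math.dist(self._color(), self.target)
        success = dist <= self.cfg.tolerance
        if self.cfg.dense_reward:
            reward = -dist
        else:
            reward = 1.0 if success else 0.0
        done = success or self.steps >= self.cfg.max_steps
        return self._observe(), reward, done, {"success": success}
```

Changing a single field of `MDPConfig` while holding the rest fixed is one way to compare configurations like those the abstract describes. Tightening `tolerance` corresponds to a stricter precision constraint.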

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Robot Manipulation and Learning