Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs
Jianzhun Du, Joseph Futoma, Finale Doshi-Velez

TL;DR
This paper introduces neural ODE-based models for semi-Markov decision processes. The models capture continuous-time dynamics accurately, support efficient policy learning from limited data, and make it possible to optimize the schedule of interactions with the environment.
Contribution
The paper presents novel neural ODE models for SMDPs and a model-based approach for optimizing time schedules, which reduces the number of environment interactions while maintaining performance.
Findings
The models accurately characterize continuous-time dynamics.
High-performing policies are learned from a small amount of data.
Optimized time schedules reduce the rate of environment interactions while keeping performance near-optimal.
Abstract
We present two elegant solutions for modeling continuous-time dynamics, in a novel model-based reinforcement learning (RL) framework for semi-Markov decision processes (SMDPs), using neural ordinary differential equations (ODEs). Our models accurately characterize continuous-time dynamics and enable us to develop high-performing policies using a small amount of data. We also develop a model-based approach for optimizing time schedules to reduce interaction rates with the environment while maintaining near-optimal performance, which is not possible for model-free methods. We experimentally demonstrate the efficacy of our methods across various continuous-time domains.
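To make the idea of a continuous-time dynamics model concrete, here is a minimal sketch of how a learned ODE could predict the next state after an arbitrary time interval. It is not the authors' architecture, which this page does not describe. The names `rk4_integrate`, `make_dynamics`, and `predict_next_state`, the one-hidden-layer tanh network, the fixed weights `W1` and `W2`, and the choice of a fixed-step Runge-Kutta solver are all assumptions made for illustration.

```python
import math


def rk4_integrate(f, z, tau, max_step=0.01):
    """Integrate dz/dt = f(z) from t = 0 to t = tau with fixed-step RK4."""
    if tau <= 0.0:
        return list(z)
    # Split the interval into equal steps no longer than max_step.
    n_steps = max(1, math.ceil(tau / max_step))
    h = tau / n_steps
    z = list(z)
    for _ in range(n_steps):
        k1 = f(z)
        k2 = f([zi + 0.5 * h * ki for zi, ki in zip(z, k1)])
        k3 = f([zi + 0.5 * h * ki for zi, ki in zip(z, k2)])
        k4 = f([zi + h * ki for zi, ki in zip(z, k3)])
        z = [
            zi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)
        ]
    return z


def make_dynamics(W1, W2, action):
    """Hypothetical vector field f(z, a): a tiny tanh network with fixed weights.

    The action is held constant over the interval, as in an SMDP transition.
    """
    def f(z):
        x = list(z) + [action]
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
        return [sum(w * hi for w, hi in zip(row, hidden)) for row in W2]
    return f


def predict_next_state(z, action, tau, W1, W2):
    """Predict the state reached after holding `action` for duration `tau`."""
    return rk4_integrate(make_dynamics(W1, W2, action), z, tau)


# Illustrative (untrained) weights: 2-dim state + 1 action -> 3 hidden -> 2 outputs.
W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5], [-0.7, 0.1, 0.4]]
W2 = [[0.2, -0.4, 0.6], [-0.3, 0.5, 0.1]]

z0 = [1.0, 0.0]
z_short = predict_next_state(z0, action=1.0, tau=0.1, W1=W1, W2=W2)
z_long = predict_next_state(z0, action=1.0, tau=2.5, W1=W1, W2=W2)
```

Because the time interval `tau` is an input to the integration rather than a fixed step size, the same model can be queried at irregular intervals. That property is what would let a planner compare different interaction schedules in simulation before acting in the real environment.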
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Elevator Systems and Control
