Reinforcement Learning with Partially Known World Dynamics
Christian R. Shelton

TL;DR
This paper introduces PKMDPs, a new framework that integrates known and unknown dynamics in reinforcement learning, enabling the incorporation of domain knowledge to improve learning efficiency in partially observable environments.
Contribution
The paper proposes PKMDPs, a novel framework that allows explicit modeling of known and unknown environment dynamics, along with a reinforcement learning algorithm based on importance sampling.
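For readers unfamiliar with importance sampling in reinforcement learning, the following is a minimal generic sketch of a trajectory-level importance-sampled return estimator. It is not the paper's algorithm: the function name `importance_sampled_return`, the trajectory format, and the restriction to memoryless policies are all assumptions made for illustration. It shows why the technique suits settings with unknown dynamics: the weight of a trajectory depends only on the ratio of policy action probabilities, so the environment's transition probabilities never need to be evaluated.

```python
def importance_sampled_return(trajectories, target_policy, behavior_policy):
    """Estimate the expected return of target_policy from data gathered
    under behavior_policy (hypothetical helper, for illustration only).

    trajectories: list of trajectories, each a list of
        (observation, action, reward) tuples.
    target_policy, behavior_policy: functions mapping
        (observation, action) -> probability of taking that action.
    """
    total = 0.0
    for trajectory in trajectories:
        weight = 1.0   # product of per-step policy probability ratios
        ret = 0.0      # undiscounted sum of rewards along the trajectory
        for obs, action, reward in trajectory:
            # The environment dynamics appear in both numerator and
            # denominator of the trajectory likelihood ratio and cancel,
            # leaving only the policy terms.
            weight *= target_policy(obs, action) / behavior_policy(obs, action)
            ret += reward
        total += weight * ret
    return total / len(trajectories)


# Usage: data collected with a uniform policy over two actions,
# evaluated for a policy that always chooses action 1.
uniform = lambda obs, action: 0.5
always_one = lambda obs, action: 1.0 if action == 1 else 0.0
data = [[("s", 0, 0.0)], [("s", 1, 1.0)]]
estimate = importance_sampled_return(data, always_one, uniform)
```

The behavior policy must assign nonzero probability to every action that appears in the data; otherwise the ratio is undefined.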
Findings
Incorporating domain knowledge improves learning efficiency.
The algorithm effectively combines planning with learning in PKMDPs.
Results demonstrate benefits of using known dynamics in reinforcement learning.
Abstract
Reinforcement learning would enjoy better success on real-world problems if domain knowledge could be imparted to the algorithm by the modelers. Most problems have both hidden state and unknown dynamics. Partially observable Markov decision processes (POMDPs) allow for the modeling of both. Unfortunately, they do not provide a natural framework in which to specify knowledge about the domain dynamics. The designer must either admit to knowing nothing about the dynamics or completely specify the dynamics (thereby turning it into a planning problem). We propose a new framework called a partially known Markov decision process (PKMDP) which allows the designer to specify known dynamics while still leaving portions of the environment's dynamics unknown. The model represents not only the environment dynamics but also the agent's knowledge of the dynamics. We present a reinforcement learning…
Taxonomy
Topics: Reinforcement Learning in Robotics · Bayesian Modeling and Causal Inference · Adversarial Robustness in Machine Learning
