ODE-based Recurrent Model-free Reinforcement Learning for POMDPs
Xuanle Zhao, Duzhen Zhang, Liyuan Han, Tielin Zhang, Bo Xu

TL;DR
This paper introduces an ODE-based recurrent model within a model-free reinforcement learning framework to effectively infer unobservable information in POMDPs, demonstrating robustness and improved performance in continuous control and meta-RL tasks.
Contribution
It presents a novel ODE-based recurrent approach that enhances information inference in POMDPs within a model-free RL setting, especially under irregular observation sampling.
Findings
Effective across various partially observable (PO) continuous control tasks
Robust against irregularly-sampled observations
Outperforms baseline methods in POMDP scenarios
Abstract
Neural ordinary differential equations (ODEs) are widely recognized as the standard for modeling physical mechanisms, which help to perform approximate inference in unknown physical or biological environments. In partially observable (PO) environments, inferring unseen information from raw observations remains a challenge for agents. By using a recurrent policy with a compact context, context-based reinforcement learning provides a flexible way to extract unobservable information from historical transitions. To help the agent extract more dynamics-related information, we present a novel ODE-based recurrent model combined with a model-free reinforcement learning (RL) framework to solve partially observable Markov decision processes (POMDPs). We experimentally demonstrate the efficacy of our methods across various PO continuous control and meta-RL tasks. Furthermore, our experiments illustrate that…
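The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea of an ODE-based recurrent context encoder, assuming a generic ODE-RNN-style pattern: the hidden state evolves under a learned ODE between observation times and receives a discrete recurrent update at each observation. The function names (`make_params`, `encode_context`), the tanh dynamics, the fixed-step Euler solver, and the random untrained weights are all illustrative assumptions, not the authors' implementation.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def make_params(obs_dim, hidden_dim, seed=0):
    """Hypothetical untrained weights; a real agent would learn these."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_f": mat(hidden_dim, hidden_dim),  # ODE dynamics: dh/dt = tanh(W_f h)
        "W_h": mat(hidden_dim, hidden_dim),  # recurrent weights of the update
        "W_x": mat(hidden_dim, obs_dim),     # input weights of the update
    }


def encode_context(observations, times, params, euler_steps=10):
    """Fold a sequence of (observation, timestamp) pairs into a context vector."""
    h = [0.0] * len(params["W_h"])
    t_prev = times[0]
    for x, t in zip(observations, times):
        # 1) Continuous part: integrate dh/dt = tanh(W_f h) over the time gap
        #    with fixed-step Euler, so a longer gap moves the state further.
        dt = (t - t_prev) / euler_steps
        for _ in range(euler_steps):
            dh = [math.tanh(a) for a in matvec(params["W_f"], h)]
            h = [hi + dt * di for hi, di in zip(h, dh)]
        # 2) Discrete part: plain RNN update with the new observation.
        pre = [a + b for a, b in zip(matvec(params["W_h"], h),
                                     matvec(params["W_x"], x))]
        h = [math.tanh(p) for p in pre]
        t_prev = t
    return h


params = make_params(obs_dim=3, hidden_dim=4)
obs = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.4, 0.4, 0.2]]

# Same observations, different timestamps: the context reflects the spacing.
ctx_regular = encode_context(obs, [0.0, 1.0, 2.0], params)
ctx_irregular = encode_context(obs, [0.0, 0.3, 2.5], params)
```

In a context-based policy, a vector such as `ctx_regular` would be concatenated with the current observation before being passed to the actor and critic. Because the time gap enters through the integration step, the same observations arriving at different times yield a different context, which is the property that matters under irregular sampling.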
Taxonomy
Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)
Methods: Parrot optimizer: Algorithm and applications to medical problems
