ODE-based Recurrent Model-free Reinforcement Learning for POMDPs

Xuanle Zhao; Duzhen Zhang; Liyuan Han; Tielin Zhang; Bo Xu

arXiv:2309.14078 · cs.LG · October 31, 2023 · 2 cites

TL;DR

This paper introduces an ODE-based recurrent model within a model-free reinforcement learning framework to effectively infer unobservable information in POMDPs, demonstrating robustness and improved performance in continuous control and meta-RL tasks.

Contribution

It presents a novel ODE-based recurrent approach that enhances information inference in POMDPs within a model-free RL setting, especially under irregular observation sampling.

Findings

01 Effective in various partially observable (PO) continuous control tasks

02 Robust against irregularly sampled observations

03 Outperforms baseline methods in POMDP scenarios

Abstract

Neural ordinary differential equations (ODEs) are widely recognized as the standard for modeling physical mechanisms, which helps to perform approximate inference in unknown physical or biological environments. In partially observable (PO) environments, inferring unseen information from raw observations is a challenge for agents. By using a recurrent policy with a compact context, context-based reinforcement learning provides a flexible way to extract unobservable information from historical transitions. To help the agent extract more dynamics-related information, we present a novel ODE-based recurrent model combined with a model-free reinforcement learning (RL) framework to solve partially observable Markov decision processes (POMDPs). We experimentally demonstrate the efficacy of our methods across various PO continuous control and meta-RL tasks. Furthermore, our experiments illustrate that…
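The idea of an ODE-based recurrent context can be illustrated with a minimal sketch. This is not the authors' architecture: it is a generic ODE-RNN-style encoder, assuming a fixed-step Euler integrator, a plain tanh update at each observation, and random untrained weights. All names (`init_params`, `ode_func`, `evolve`, `encode_context`) are hypothetical. The hidden state evolves continuously over the elapsed time between observations, so irregular sampling gaps directly influence the resulting context.

```python
import numpy as np

def init_params(obs_dim, hidden_dim, seed=0):
    # Random, untrained weights; for illustration only.
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "W_ode": scale * rng.standard_normal((hidden_dim, hidden_dim)),
        "b_ode": np.zeros(hidden_dim),
        "W_h": scale * rng.standard_normal((hidden_dim, hidden_dim)),
        "W_x": scale * rng.standard_normal((hidden_dim, obs_dim)),
        "b": np.zeros(hidden_dim),
    }

def ode_func(h, params):
    # Assumed hidden dynamics dh/dt = tanh(W h + b) - h.
    return np.tanh(params["W_ode"] @ h + params["b_ode"]) - h

def evolve(h, dt, params, n_steps=10):
    # Fixed-step Euler integration of the hidden state over a gap of length dt.
    step = dt / n_steps
    for _ in range(n_steps):
        h = h + step * ode_func(h, params)
    return h

def encode_context(observations, timestamps, params):
    # Alternate between continuous evolution over each time gap
    # and a discrete recurrent update at each observation.
    h = np.zeros(params["b"].shape[0])
    t_prev = timestamps[0]
    for x, t in zip(observations, timestamps):
        h = evolve(h, t - t_prev, params)
        h = np.tanh(params["W_h"] @ h + params["W_x"] @ x + params["b"])
        t_prev = t
    return h

if __name__ == "__main__":
    params = init_params(obs_dim=3, hidden_dim=8)
    obs = np.random.default_rng(1).standard_normal((5, 3))
    regular = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    irregular = np.array([0.0, 0.2, 2.5, 2.6, 4.0])
    print(encode_context(obs, regular, params))
    print(encode_context(obs, irregular, params))
```

In a context-based policy, a vector like the one returned by `encode_context` would be concatenated with the current observation before being passed to the actor and critic. A practical implementation would likely replace the Euler loop with an adaptive ODE solver and learn the weights end to end.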

Videos

ODE-based Recurrent Model-free Reinforcement Learning for POMDPs · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)

Methods: Parrot optimizer: Algorithm and applications to medical problems