Reinforcement Learning from Delayed Observations via World Models

Armin Karamzade; Kyungmin Kim; Montek Kalsi; Roy Fox

arXiv:2403.12309 · cs.LG · June 27, 2024 · 3 cites

TL;DR

This paper uses world models to handle delayed observations in reinforcement learning. It reduces delayed POMDPs to delayed MDPs and extends delay-aware RL to partially observable environments with visual observations.

Contribution

It proposes a novel approach to address observation delays by leveraging world models, extending reinforcement learning to delay-aware visual control tasks.

Findings

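01. One of the proposed methods outperforms a naive model-based approach by up to 250%.

02. Delay-aware RL is applied successfully to visual environments.

03. Performance improves in partially observable, delayed settings, where existing approaches are sub-optimal or degrade quickly as observability decreases.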

Abstract

In standard reinforcement learning settings, agents typically assume immediate feedback about the effects of their actions after taking them. However, in practice, this assumption may not hold true due to physical constraints and can significantly impact the performance of learning algorithms. In this paper, we address observation delays in partially observable environments. We propose leveraging world models, which have shown success in integrating past observations and learning dynamics, to handle observation delays. By reducing delayed POMDPs to delayed MDPs with world models, our methods can effectively handle partial observability, where existing approaches achieve sub-optimal performance or degrade quickly as observability decreases. Experiments suggest that one of our methods can outperform a naive model-based approach by up to 250%. Moreover, we evaluate our methods on visual…
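The reduction described in the abstract can be pictured with a small sketch. It is an illustration only, not the authors' implementation, and it assumes a constant, known delay of d steps. Under that assumption the agent sees only a latent state that is d steps old, remembers the d actions it has taken since, and rolls a dynamics model forward through those actions to estimate the current latent state. The names `toy_transition`, `predict_current_latent`, and `ActionBuffer` are hypothetical, and the additive transition stands in for a learned world model.

```python
from collections import deque
from typing import Callable, List, Tuple

Latent = Tuple[float, ...]
Action = Tuple[float, ...]


def toy_transition(z: Latent, a: Action) -> Latent:
    """Hypothetical stand-in for a learned latent dynamics model: z' = z + a."""
    return tuple(zi + ai for zi, ai in zip(z, a))


def predict_current_latent(
    delayed_latent: Latent,
    pending_actions: List[Action],
    transition: Callable[[Latent, Action], Latent] = toy_transition,
) -> Latent:
    """Roll the dynamics model forward through the actions taken since
    the delayed observation, oldest action first."""
    z = delayed_latent
    for a in pending_actions:
        z = transition(z, a)
    return z


class ActionBuffer:
    """Keeps the last `delay` actions, i.e. those whose effects have not
    yet shown up in any observation the agent has received."""

    def __init__(self, delay: int, null_action: Action) -> None:
        # Pre-fill with a null action so the buffer always holds `delay` items.
        self._buf = deque([null_action] * delay, maxlen=delay)

    def push(self, action: Action) -> None:
        self._buf.append(action)  # the oldest action is dropped automatically

    def pending(self) -> List[Action]:
        return list(self._buf)


# Usage: with a delay of 2, the agent holds a latent that is two steps old.
buffer = ActionBuffer(delay=2, null_action=(0.0,))
buffer.push((1.0,))
buffer.push((2.0,))
estimate = predict_current_latent((0.0,), buffer.pending())
```

A policy would then act on `estimate` rather than on the stale latent. In a real system the transition would be a trained world model, and the delayed latent would come from encoding the observation history.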

Code & Models

Repositories

indylab/delayeddreamer (JAX, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Anomaly Detection Techniques and Applications · Data Stream Mining Techniques
