Reward Machines for Deep RL in Noisy and Uncertain Environments

Andrew C. Li; Zizhao Chen; Toryn Q. Klassen; Pashootan Vaezipoor; Rodrigo Toro Icarte; Sheila A. McIlraith

arXiv:2406.00120 · cs.LG · January 16, 2025 · 1 citation

TL;DR

This paper investigates how Reward Machines can be used to improve deep reinforcement learning in environments with noise and uncertainty, by exploiting task structure despite partial observability and noisy sensing.

Contribution

It frames the use of Reward Machines under an uncertain interpretation of the domain-specific vocabulary as a POMDP, and introduces a suite of RL algorithms that exploit Reward Machine structure despite partial observability and noisy sensing.

Findings

01 Naive approaches fail under noisy conditions.

02 Structured reward representations improve learning efficiency.

03 Task structure can be exploited despite noisy domain vocabularies.

Abstract

Reward Machines provide an automaton-inspired structure for specifying instructions, safety constraints, and other temporally extended reward-worthy behaviour. By exposing the underlying structure of a reward function, they enable the decomposition of an RL task, leading to impressive gains in sample efficiency. Although Reward Machines and similar formal specifications have a rich history of application towards sequential decision-making problems, they critically rely on a ground-truth interpretation of the domain-specific vocabulary that forms the building blocks of the reward function--such ground-truth interpretations are elusive in the real world due in part to partial observability and noisy sensing. In this work, we explore the use of Reward Machines for Deep RL in noisy and uncertain environments. We characterize this problem as a POMDP and propose a suite of RL algorithms that…
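
To make the abstract's central object concrete, the following is a minimal sketch of a Reward Machine, together with one simple way of tracking uncertainty over its state when propositions are only detected with some probability. It is illustrative only and is not taken from the paper or its repository: the `RewardMachine` class, the `belief_update` helper, the "coffee then office" task, and the probabilities are all hypothetical. The belief update additionally assumes that propositions triggering different edges out of the same state are mutually exclusive.

```python
from dataclasses import dataclass


@dataclass
class RewardMachine:
    """Hypothetical minimal Reward Machine: a finite-state machine whose
    edges are triggered by propositions and emit a reward."""
    initial: str
    # transitions[(state, proposition)] = (next_state, reward)
    transitions: dict

    def step(self, state, props):
        """Advance using the ground-truth set of true propositions."""
        for p in sorted(props):
            if (state, p) in self.transitions:
                return self.transitions[(state, p)]
        return state, 0.0  # no matching edge: stay in place, no reward


def belief_update(rm, belief, prop_probs):
    """Propagate a distribution over RM states one step.

    prop_probs[p] is an (assumed) estimate of the probability that
    proposition p holds now, e.g. from a noisy detector. Assumes edges
    leaving the same state are triggered by mutually exclusive propositions.
    """
    new_belief = {}
    for state, mass in belief.items():
        moved = 0.0
        for (src, prop), (dst, _reward) in rm.transitions.items():
            if src != state:
                continue
            q = prop_probs.get(prop, 0.0)
            new_belief[dst] = new_belief.get(dst, 0.0) + mass * q
            moved += mass * q
        # Whatever probability mass did not transition stays in `state`.
        new_belief[state] = new_belief.get(state, 0.0) + (mass - moved)
    return new_belief


# Hypothetical task: pick up coffee, then deliver it to the office.
rm = RewardMachine(
    initial="u0",
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("done", 1.0),
    },
)

# With ground-truth propositions, the RM state is known exactly.
state, reward = rm.step(rm.initial, {"coffee"})

# With noisy sensing, only a belief over RM states can be maintained.
belief = {rm.initial: 1.0}
belief = belief_update(rm, belief, {"coffee": 0.8})
belief = belief_update(rm, belief, {"office": 0.5})
```

With ground-truth propositions, `step` determines the Reward Machine state exactly, which is what allows the task to be decomposed. Under noisy sensing, the agent can at best hold a distribution such as `belief`, which is one way to see why the abstract frames the problem as a POMDP.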

Code & Models

Repositories

andrewli77/reward-machines-noisy-environments (PyTorch, official)

Videos

Reward Machines for Deep RL in Noisy and Uncertain Environments · SlidesLive

Taxonomy

Topics: Electrostatic Discharge in Electronics · Advanced Memory and Neural Computing · Low-power high-performance VLSI design