Explanation-Aware Experience Replay in Rule-Dense Environments

Francesco Sovrano; Alex Raymond; Amanda Prorok

arXiv:2109.14711·cs.LG·January 20, 2022

TL;DR

This paper introduces Explanation-Aware Experience Replay (XAER), a method that organizes experience buffers based on rule-based explanations to improve reinforcement learning in rule-dense environments like autonomous driving.

Contribution

It proposes an experience replay technique that partitions the experience buffer into clusters labelled by rule-based explanation, so that experiences can be sampled according to the rarity, importance, and meaning of events in rule-dense environments.

Findings

01. XAER outperforms traditional prioritized experience replay methods.

02. Explanation engineering can substitute reward engineering in environments with explainable features.

03. The method is validated across multiple navigation environments and learning tasks.

Abstract

Human environments are often regulated by explicit and complex rulesets. Integrating Reinforcement Learning (RL) agents into such environments motivates the development of learning mechanisms that perform well in rule-dense and exception-ridden environments such as autonomous driving on regulated roads. In this paper, we propose a method for organising experience by means of partitioning the experience buffer into clusters labelled on a per-explanation basis. We present discrete and continuous navigation environments compatible with modular rulesets and 9 learning tasks. For environments with explainable rulesets, we convert rule-based explanations into case-based explanations by allocating state-transitions into clusters labelled with explanations. This allows us to sample experiences in a curricular and task-oriented manner, focusing on the rarity, importance, and meaning of events.…
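The clustering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `ExplanationClusteredBuffer`, the explanation labels, and the uniform-over-clusters sampling rule are all assumptions chosen for illustration, and the paper's actual sampling scheme may differ. The sketch stores each state-transition in a cluster keyed by its explanation label, then samples by first picking a cluster and then a transition inside it, so rarely explained events are drawn more often than their raw frequency would suggest.

```python
import random
from collections import defaultdict, deque


class ExplanationClusteredBuffer:
    """Replay buffer partitioned into clusters keyed by an explanation label.

    Hypothetical sketch: names and sampling rule are illustrative assumptions.
    """

    def __init__(self, capacity_per_cluster=1000, seed=None):
        # Each cluster is a bounded FIFO queue; the oldest transitions are evicted first.
        self.clusters = defaultdict(lambda: deque(maxlen=capacity_per_cluster))
        self.rng = random.Random(seed)

    def add(self, transition, explanation):
        # File the transition under the explanation that labels it.
        self.clusters[explanation].append(transition)

    def sample(self, batch_size):
        labels = list(self.clusters)
        if not labels:
            raise ValueError("cannot sample from an empty buffer")
        batch = []
        for _ in range(batch_size):
            # Assumption: choose a cluster uniformly, then a transition uniformly within it.
            label = self.rng.choice(labels)
            batch.append((label, self.rng.choice(self.clusters[label])))
        return batch


# Usage: 95% of stored transitions share a common explanation, 5% a rare one.
buffer = ExplanationClusteredBuffer(capacity_per_cluster=100, seed=0)
for _ in range(95):
    buffer.add(("state", "action", 0.0, "next_state"), "within_speed_limit")
for _ in range(5):
    buffer.add(("state", "action", -1.0, "next_state"), "ran_red_light")

batch = buffer.sample(1000)
rare_share = sum(label == "ran_red_light" for label, _ in batch) / len(batch)
print(f"share of rare-explanation samples: {rare_share:.2f}")
```

With uniform sampling over the whole buffer the rare explanation would make up about 5% of a batch; sampling per cluster brings it to roughly half in this two-cluster example. A prioritised variant could weight clusters or transitions instead of sampling them uniformly.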

Code & Models

Repositories

proroklab/xaer (tf, official)

Taxonomy

Methods: Q-Learning · Adam · Experience Replay · Dense Connections · 1x1 Convolution · Clipped Double Q-learning · Deep Q-Network · Dilated Convolution · Target Policy Smoothing