Reinforcement Learning under Threats

Victor Gallego; Roi Naveiro; David Rios Insua

arXiv:1809.01560·cs.LG·October 28, 2019


TL;DR

This paper introduces Threatened Markov Decision Processes (TMDPs) and a level-$k$ thinking scheme to enhance reinforcement learning robustness against adversarial threats, supported by theoretical analysis and extensive empirical validation.

Contribution

The paper presents TMDPs, a framework for RL in the presence of adversaries, and a level-$k$ thinking scheme for learning in them, backed by theoretical results and experiments.

Findings

01 Accounting for adversaries improves RL performance.

02 The proposed framework enhances robustness in security-related RL scenarios.

03 Empirical results demonstrate the effectiveness of the approach.

Abstract

In several reinforcement learning (RL) scenarios, mainly in security settings, there may be adversaries trying to interfere with the reward generating process. In this paper, we introduce Threatened Markov Decision Processes (TMDPs), which provide a framework to support a decision maker against a potential adversary in RL. Furthermore, we propose a level-$k$ thinking scheme resulting in a new learning framework to deal with TMDPs. After introducing our framework and deriving theoretical results, relevant empirical evidence is given via extensive experiments, showing the benefits of accounting for adversaries while the agent learns.
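
The idea of accounting for an adversary while the agent learns can be illustrated with a small sketch. This is not the authors' implementation from the linked repository; it is a minimal example under stated assumptions: the agent keeps a Q-function indexed by its own action and the opponent's action, models the opponent as non-strategic through empirical action frequencies (one possible reading of a level-0 opponent), and averages Q over that model when choosing actions. The class `OpponentAwareQLearner`, the helper `train_demo`, and the toy reward matrix are all hypothetical names and values introduced for illustration.

```python
import random
from collections import defaultdict


class OpponentAwareQLearner:
    """Tabular Q-learner whose Q-values depend on the opponent's action.

    Assumption: the opponent is modeled by per-state empirical action
    frequencies; this is an illustrative choice, not the paper's code.
    """

    def __init__(self, n_actions, n_opp_actions,
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.n_opp_actions = n_opp_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        # Q[(state, own_action, opp_action)], initialized to 0.
        self.q = defaultdict(float)
        # Pseudo-counts of opponent actions per state (uniform prior of 1).
        self.counts = defaultdict(lambda: [1.0] * n_opp_actions)

    def opp_probs(self, state):
        """Estimated distribution over the opponent's actions in `state`."""
        c = self.counts[state]
        total = sum(c)
        return [x / total for x in c]

    def expected_q(self, state, a):
        """Q(state, a, b) averaged over the opponent model."""
        p = self.opp_probs(state)
        return sum(p[b] * self.q[(state, a, b)]
                   for b in range(self.n_opp_actions))

    def greedy_action(self, state):
        return max(range(self.n_actions),
                   key=lambda a: self.expected_q(state, a))

    def act(self, state):
        """Epsilon-greedy action with respect to the expected Q-values."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.greedy_action(state)

    def update(self, state, a, b, reward, next_state):
        """Update the opponent model, then apply a Q-learning step."""
        self.counts[state][b] += 1.0
        best_next = max(self.expected_q(next_state, a2)
                        for a2 in range(self.n_actions))
        key = (state, a, b)
        target = reward + self.gamma * best_next
        self.q[key] = (1 - self.alpha) * self.q[key] + self.alpha * target


def train_demo(steps=500):
    """Repeated 2x2 game: the agent is rewarded for mismatching the opponent."""
    reward = [[0.0, 1.0], [1.0, 0.0]]  # reward[own_action][opp_action]
    agent = OpponentAwareQLearner(n_actions=2, n_opp_actions=2)
    state = 0  # single-state (stateless) game
    for _ in range(steps):
        a = agent.act(state)
        b = 0  # toy adversary that always plays action 0
        agent.update(state, a, b, reward[a][b], state)
    return agent
```

In this toy setting the opponent always plays action 0, so the agent's frequency model concentrates on that action and the expected Q-values favor the mismatching action 1. A higher-level agent in the level-$k$ sense would replace the frequency model with a model of an opponent who is itself learning; that extension is not shown here.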


Code & Models

Repositories

vicgalle/ARAMARL (jax, Official)
