User-Oriented Robust Reinforcement Learning

Haoyi You; Beichen Yu; Haiming Jin; Zhaoxing Yang; Jiahui Sun

arXiv:2202.07301 · cs.LG · December 13, 2022

TL;DR

This paper introduces a user-oriented robust reinforcement learning (UOR-RL) framework that incorporates user preferences over environments into policy optimization, balancing robustness against personalization. On MuJoCo tasks, it achieves state-of-the-art results under the proposed UOR metric.

Contribution

The paper proposes a novel User-Oriented Robustness (UOR) metric for RL, develops training algorithms for scenarios that differ in how much is known about the environment distribution, and proves that these algorithms converge to near-optimal policies.

Findings

01 UOR-RL achieves state-of-the-art results under the UOR metric.

02 UOR-RL performs comparably to baselines on average and worst-case metrics.

03 Theoretical convergence guarantees are provided for the proposed algorithms.

Abstract

Recently, improving the robustness of policies across different environments has attracted increasing attention in the reinforcement learning (RL) community. Existing robust RL methods mostly aim to achieve max-min robustness by optimizing the policy's performance in the worst-case environment. In practice, however, a user who uses an RL policy may have different preferences over its performance across environments, and max-min robustness is often too conservative to satisfy such preferences. Therefore, in this paper, we integrate user preference into policy learning in robust RL and propose a novel User-Oriented Robust RL (UOR-RL) framework. Specifically, we define a new User-Oriented Robustness (UOR) metric for RL, which allocates different weights to the environments according to user preference and generalizes the max-min robustness metric. To optimize…
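
The idea of weighting environments by user preference can be illustrated with a minimal sketch. This is not the paper's UOR definition, which the excerpt above does not give. It assumes a simple rank-weighted average: per-environment returns are sorted from worst to best, and each rank receives a user-chosen weight. The function name `rank_weighted_score` and the sample numbers are hypothetical.

```python
from typing import Sequence


def rank_weighted_score(returns: Sequence[float], weights: Sequence[float]) -> float:
    """Illustrative rank-weighted robustness score (an assumption, not the paper's metric).

    returns: one policy return per environment, in any order.
    weights: non-negative preference weights; weights[0] applies to the
             worst-performing environment, weights[-1] to the best.
    """
    if len(returns) != len(weights):
        raise ValueError("returns and weights must have the same length")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and sum to a positive value")
    ordered = sorted(returns)  # worst environment first
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, ordered)) / total


# Hypothetical returns of one policy in four environments.
returns = [12.0, 3.0, 8.0, 5.0]

# All weight on the worst environment recovers the max-min objective's inner term.
worst_case = rank_weighted_score(returns, [1, 0, 0, 0])   # 3.0

# Uniform weights recover the average-case return.
average = rank_weighted_score(returns, [1, 1, 1, 1])      # 7.0

# A mildly risk-averse user weights poor environments more heavily.
risk_averse = rank_weighted_score(returns, [4, 3, 2, 1])  # 5.5
```

Under this assumed form, the choice of weights moves the score continuously between the worst-case and average-case values, which is one way to read the claim that a preference-weighted metric generalizes max-min robustness.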

Taxonomy

Topics: Reinforcement Learning in Robotics · Mobile Crowdsensing and Crowdsourcing · Smart Parking Systems Research