A Robust Policy Bootstrapping Algorithm for Multi-objective Reinforcement Learning in Non-stationary Environments

Sherif Abdelfattah; Kathryn Kasmarik; Jiankun Hu

arXiv:2308.09734·cs.LG·August 22, 2023

Open Access

TL;DR

This paper presents a new multi-objective reinforcement learning algorithm that adaptively evolves policy coverage sets in non-stationary environments, outperforming existing methods in dynamic settings.

Contribution

It introduces a robust, online developmental optimization algorithm for evolving policy coverage sets in non-stationary environments, addressing a key limitation of prior methods: their optimization procedures assume stationary dynamics.

Findings

01. Outperforms existing algorithms in non-stationary environments

02. Achieves comparable results to state-of-the-art in stationary environments

03. Demonstrates robustness and adaptability in dynamic settings

Abstract

Multi-objective Markov decision processes are a special kind of multi-objective optimization problem that involves sequential decision making while satisfying the Markov property of stochastic processes. Multi-objective reinforcement learning methods address this problem by fusing the reinforcement learning paradigm with multi-objective optimization techniques. One major drawback of these methods is the lack of adaptability to non-stationary dynamics in the environment. This is because they adopt optimization procedures that assume stationarity to evolve a coverage set of policies that can solve the problem. This paper introduces a developmental optimization approach that can evolve the policy coverage set while exploring the preference space over the defined objectives in an online manner. We propose a novel multi-objective reinforcement learning algorithm that can robustly evolve a…
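
To make the terms "coverage set of policies" and "preference space" concrete, here is a minimal sketch. It assumes two objectives and linear scalarization (a weighted sum of per-objective returns), which the abstract does not specify; the value vectors and the function names `scalarize`, `best_policy`, and `coverage_set` are hypothetical and are not taken from the paper. The sketch covers only the static notion of a coverage set, not the paper's online algorithm for non-stationary environments.

```python
def scalarize(values, weights):
    """Weighted sum of one policy's per-objective values (assumed linear scalarization)."""
    return sum(v * w for v, w in zip(values, weights))


def best_policy(value_vectors, weights):
    """Index of the policy with the highest scalarized value for this preference."""
    scores = [scalarize(v, weights) for v in value_vectors]
    return max(range(len(scores)), key=scores.__getitem__)


def coverage_set(value_vectors, n_samples=101):
    """Indices of policies that are best for at least one sampled preference.

    Sweeps the two-objective preference space with weights (w1, 1 - w1).
    """
    winners = set()
    for i in range(n_samples):
        w1 = i / (n_samples - 1)
        winners.add(best_policy(value_vectors, (w1, 1.0 - w1)))
    return sorted(winners)


# Hypothetical per-policy value vectors: (objective 1, objective 2).
policies = [(10.0, 0.0), (0.0, 10.0), (6.0, 6.0), (3.0, 3.0)]

if __name__ == "__main__":
    print(coverage_set(policies))
```

In this toy example the last policy is never the best choice for any preference, so it falls outside the coverage set. If the environment dynamics change, the value vectors change with them and a set computed once can become stale, which is the limitation the abstract attributes to methods that assume stationarity.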

Taxonomy

Topics: Advanced Multi-Objective Optimization Algorithms · Reinforcement Learning in Robotics