Dynamic Weights in Multi-Objective Deep Reinforcement Learning

Axel Abels; Diederik M. Roijers; Tom Lenaerts; Ann Nowé; Denis Steckelmacher

arXiv:1809.07803 · cs.LG · May 14, 2019 · 41 cites

TL;DR

This paper introduces a multi-objective Q-network conditioned on changing objective weights and a Diverse Experience Replay method, enabling effective deep reinforcement learning in dynamic, multi-objective environments with high-dimensional inputs.

Contribution

The paper proposes a multi-objective Q-network architecture and the Diverse Experience Replay (DER) technique to handle dynamic weights and the resulting non-stationarity in deep RL settings, extending prior tabular approaches.
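To make the idea of acting under changing objective weights concrete, here is a minimal sketch, not the authors' implementation. It assumes linear scalarization, meaning the agent picks the action whose weighted sum of per-objective Q-values is largest. The function names `scalarize` and `greedy_action` and the numbers in `q_values` are hypothetical; in the paper's setting the Q-vectors would come from a network that also takes the weights as input, whereas this sketch uses a fixed table.

```python
from typing import List, Sequence


def scalarize(q_vector: Sequence[float], weights: Sequence[float]) -> float:
    """Linear scalarization: weighted sum of per-objective Q-values."""
    if len(q_vector) != len(weights):
        raise ValueError("q_vector and weights must have the same length")
    return sum(q * w for q, w in zip(q_vector, weights))


def greedy_action(q_values: List[Sequence[float]], weights: Sequence[float]) -> int:
    """Return the index of the action with the highest scalarized Q-value."""
    scores = [scalarize(q, weights) for q in q_values]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical per-action Q-vectors for two objectives (made-up numbers,
# standing in for the output of a weight-conditioned network).
q_values = [
    [1.0, 0.0],  # action 0: good for objective 0 only
    [0.0, 1.0],  # action 1: good for objective 1 only
    [0.6, 0.6],  # action 2: a compromise between both
]

# The preferred action changes as the weights change.
for w in ([0.9, 0.1], [0.1, 0.9], [0.5, 0.5]):
    print(w, "->", greedy_action(q_values, w))
```

With the same Q-vectors, shifting the weights alone changes which action is greedy, which is why a policy learned for one weight vector cannot simply be reused after the weights change.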

Findings

01 The proposed method outperforms adapted multi-task RL algorithms across various scenarios.

02 The proposed network effectively adapts to changing objective importance.

03 Diverse Experience Replay improves learning stability in non-stationary environments.

Abstract

Many real-world decision problems are characterized by multiple conflicting objectives, which must be balanced based on their relative importance. In the dynamic weights setting, the relative importance changes over time, and specialized algorithms that deal with such change, such as the tabular Reinforcement Learning (RL) algorithm by Natarajan and Tadepalli (2005), are required. However, this earlier work is not feasible for RL settings that necessitate the use of function approximators. We generalize across weight changes and high-dimensional inputs by proposing a multi-objective Q-network whose outputs are conditioned on the relative importance of objectives, and we introduce Diverse Experience Replay (DER) to counter the inherent non-stationarity of the dynamic weights setting. We perform an extensive experimental evaluation and compare our methods to adapted algorithms from Deep…

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Experience Replay