Multi-Objective Deep Reinforcement Learning

Hossam Mossalam; Yannis M. Assael; Diederik M. Roijers; Shimon Whiteson

arXiv:1610.02707 · cs.AI · October 11, 2016 · 93 cites

TL;DR

This paper introduces Deep Optimistic Linear Support Learning (DOL), a novel deep reinforcement learning method capable of handling high-dimensional multi-objective decision problems without prior knowledge of objective importance, and provides a new benchmark for this field.

Contribution

The paper presents what the authors believe to be the first successful application of deep reinforcement learning to multi-objective policy learning, and introduces a two-experiment testbed as a benchmark for future research.

Findings

01. DOL computes the convex coverage set for multi-objective problems.

02. First demonstration of deep RL succeeding in multi-objective policy learning.

03. Provides a new benchmark for deep multi-objective reinforcement learning.

Abstract

We propose Deep Optimistic Linear Support Learning (DOL) to solve high-dimensional multi-objective decision problems where the relative importances of the objectives are not known a priori. Using features from the high-dimensional inputs, DOL computes the convex coverage set containing all potential optimal solutions of the convex combinations of the objectives. To our knowledge, this is the first time that deep reinforcement learning has succeeded in learning multi-objective policies. In addition, we provide a testbed with two experiments to be used as a benchmark for deep multi-objective reinforcement learning.
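
To make the notion of a convex coverage set (CCS) concrete, here is a minimal sketch under stated assumptions. It is not DOL itself and not the authors' code: DOL learns value vectors with deep networks, whereas this example starts from a small, made-up list of two-objective value vectors and brute-forces a grid of weights. The names `scalarize` and `approximate_ccs_2d`, the `steps` parameter, and the candidate vectors are all hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def scalarize(value: Sequence[float], weights: Sequence[float]) -> float:
    """Linear scalarization: the weighted sum of the per-objective values."""
    return sum(w * v for w, v in zip(weights, value))


def approximate_ccs_2d(values: List[Vector], steps: int = 100) -> List[Vector]:
    """Approximate the CCS of two-objective value vectors.

    A vector is kept if it maximizes the scalarized value for at least one
    weight (w1, 1 - w1) on an evenly spaced grid. This is a brute-force
    illustration; vectors that are optimal only between grid points are missed.
    """
    kept: List[Vector] = []
    for i in range(steps + 1):
        w1 = i / steps
        weights = (w1, 1.0 - w1)
        best = max(values, key=lambda v: scalarize(v, weights))
        if best not in kept:
            kept.append(best)
    return kept


# Made-up value vectors for five hypothetical policies (objective 1, objective 2).
candidates = [(1.0, 9.0), (5.0, 6.0), (9.0, 1.0), (4.0, 4.0), (3.0, 7.0)]
ccs = approximate_ccs_2d(candidates)
print(ccs)
```

In this toy data, (4.0, 4.0) is dominated by (5.0, 6.0), while (3.0, 7.0) is not dominated by any single vector yet is never the best choice under any convex combination of the two objectives, so neither belongs to the CCS. A grid sweep is used here only because it is easy to read; it scales poorly with the number of objectives.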

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Adaptive Dynamic Programming Control