Multi-Objective Policy Gradients with Topological Constraints

Kyle Hollins Wray; Stas Tiomkin; Mykel J. Kochenderfer; Pieter Abbeel

arXiv:2209.07096·cs.AI·September 16, 2022

Open Access

TL;DR

This paper extends topological Markov decision processes (TMDPs) to continuous spaces and unknown transition dynamics. It derives a policy gradient theorem for TMDPs and implements a new algorithm, a simple extension of proximal policy optimization (PPO), for multi-objective problems with ordered constraints.

Contribution

The paper formulates and proves a policy gradient theorem for TMDPs in continuous spaces, which enables learning algorithms that use function approximators and incorporate topological constraints into deep RL.

Findings

01 Successful implementation of the TMDP policy gradient algorithm

02 Effective navigation in real-world multi-objective robot tasks

03 Generalization of existing DRL methods to topologically constrained problems

Abstract

Multi-objective optimization models that encode ordered sequential constraints provide a way to model a variety of challenging problems, including encoding preferences, modeling a curriculum, and enforcing measures of safety. A recently developed theory of topological Markov decision processes (TMDPs) captures this range of problems for the case of discrete states and actions. In this work, we extend TMDPs to continuous spaces and unknown transition dynamics by formulating, proving, and implementing the policy gradient theorem for TMDPs. This theoretical result enables the creation of TMDP learning algorithms that use function approximators and can generalize existing deep reinforcement learning (DRL) approaches. Specifically, we present a new algorithm for a policy gradient in TMDPs by a simple extension of the proximal policy optimization (PPO) algorithm. We demonstrate this on a…
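
The abstract names PPO as the base algorithm. As background only, the following is a minimal sketch of the standard single-objective PPO clipped surrogate, plus a hypothetical helper that evaluates it once per objective. It is not the paper's TMDP algorithm: this summary does not say how the ordered constraints enter the update, so the helper only reports one surrogate value per objective and makes no attempt to combine them. The names `ppo_clip_surrogate`, `per_objective_surrogates`, and `clip_eps` are assumptions chosen for illustration.

```python
def ppo_clip_surrogate(ratios, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate (a quantity to maximize).

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    clip_eps:   clipping range; 0.2 is a common default, assumed here
    """
    terms = []
    for r, a in zip(ratios, advantages):
        # Clip the ratio to [1 - eps, 1 + eps], then take the pessimistic
        # (smaller) of the unclipped and clipped objectives.
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        terms.append(min(r * a, clipped_r * a))
    return sum(terms) / len(terms)


def per_objective_surrogates(ratios, advantages_per_objective, clip_eps=0.2):
    """Hypothetical helper: one clipped surrogate per objective.

    advantages_per_objective is a list with one advantage list per objective.
    How a TMDP method orders or constrains these values is NOT shown here.
    """
    return [
        ppo_clip_surrogate(ratios, adv, clip_eps)
        for adv in advantages_per_objective
    ]


if __name__ == "__main__":
    ratios = [1.5, 0.5]
    advantages = [[1.0, -1.0], [0.0, 0.0]]  # two objectives, two samples each
    print(per_objective_surrogates(ratios, advantages))
```

In a multi-objective setting, each objective would typically have its own advantage estimates while sharing one policy, and therefore one set of probability ratios, which is why the helper reuses `ratios` across objectives.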

Taxonomy

Topics: Advanced Multi-Objective Optimization Algorithms