Prioritized Soft Q-Decomposition for Lexicographic Reinforcement Learning

Finn Rietz; Erik Schaffernicht; Stefan Heinrich; Johannes Andreas Stork

arXiv:2310.02360·cs.AI·May 3, 2024

TL;DR

This paper introduces PSQD, a novel algorithm for lexicographic multi-objective reinforcement learning that learns, reuses, and adapts subtask solutions in continuous state-action spaces while respecting the subtasks' priority order.

Contribution

The paper proposes prioritized soft Q-decomposition (PSQD), enabling zero-shot reuse and offline adaptation of subtask solutions under lexicographic priorities in continuous RL tasks.

Findings

01 Successful learning and adaptation in robot control tasks

02 Effective reuse of subtask solutions without additional environment interaction

03 Maintains subtask priorities during learning, outperforming baselines

Abstract

Reinforcement learning (RL) for complex tasks remains a challenge, primarily due to the difficulties of engineering scalar reward functions and the inherent inefficiency of training models from scratch. Instead, it would be better to specify complex tasks in terms of elementary subtasks and to reuse subtask solutions whenever possible. In this work, we address continuous space lexicographic multi-objective RL problems, consisting of prioritized subtasks, which are notoriously difficult to solve. We show that these can be scalarized with a subtask transformation and then solved incrementally using value decomposition. Exploiting this insight, we propose prioritized soft Q-decomposition (PSQD), a novel algorithm for learning and adapting subtask solutions under lexicographic priorities in continuous state-action spaces. PSQD offers the ability to reuse previously learned subtask solutions…
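
To make the idea of lexicographic priorities concrete, here is a minimal sketch of prioritized action selection over a small discrete set of candidate actions. It is an illustration only, not the PSQD algorithm, which operates in continuous state-action spaces. The function name `lexicographic_action`, the tolerance list `eps`, and the toy Q-values are all assumptions introduced for this example. Each higher-priority subtask narrows the allowed actions to those whose Q-value is within its tolerance of the best allowed value, and the lowest-priority subtask then picks its best action among those that remain.

```python
import numpy as np


def lexicographic_action(q_values, eps):
    """Pick an action under lexicographic subtask priorities (illustrative sketch).

    q_values: array-like of shape (n_subtasks, n_actions); rows are ordered
              from highest to lowest priority (hypothetical toy values).
    eps:      one tolerance per higher-priority subtask (hypothetical); an
              action stays allowed if its Q-value is within eps of the best
              allowed Q-value for that subtask.
    """
    q_values = np.asarray(q_values, dtype=float)
    allowed = np.ones(q_values.shape[1], dtype=bool)

    # Higher-priority subtasks successively restrict the allowed action set.
    for q, tolerance in zip(q_values[:-1], eps):
        best = q[allowed].max()
        allowed &= q >= best - tolerance

    # The lowest-priority subtask chooses greedily among the allowed actions.
    masked = np.where(allowed, q_values[-1], -np.inf)
    return int(masked.argmax())


# Toy example: two subtasks, three candidate actions.
q = [
    [1.0, 0.95, 0.2],  # high-priority subtask
    [0.0, 0.5, 1.0],   # low-priority subtask
]
print(lexicographic_action(q, eps=[0.1]))  # 1
print(lexicographic_action(q, eps=[0.0]))  # 0
```

With a tolerance of 0.1, the high-priority subtask is indifferent between actions 0 and 1, so the low-priority subtask is free to prefer action 1. With zero tolerance, only action 0 remains allowed. Action 2 is never chosen, even though the low-priority subtask values it most.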

Code & Models

Repositories

frietz58/psqd (PyTorch, official)

Videos

Prioritized Soft Q-Decomposition for Lexicographic Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Software Engineering Research