Towards Task-Prioritized Policy Composition
Finn Rietz, Erik Schaffernicht, Todor Stoyanov, Johannes A. Stork

TL;DR
This paper introduces a task-prioritized policy composition framework for Reinforcement Learning, inspired by null-space control in control theory. The framework aims at modular, data-efficient policy learning and is promising for safety-critical settings because it can enforce high-priority constraints.
Contribution
It proposes the concept of the indifferent-space for RL policies and a method for globally optimal policy learning within this space, enhancing modularity and safety.
Findings
The framework has the potential to facilitate knowledge transfer and modular design.
The approach can ensure high-priority constraint satisfaction.
Enables online learning of optimal policies in the indifferent-space.
Abstract
Combining learned policies in a prioritized, ordered manner is desirable because it allows for modular design and facilitates data reuse through knowledge transfer. In control theory, prioritized composition is realized by null-space control, where low-priority control actions are projected into the null-space of high-priority control actions. Such a method is currently unavailable for Reinforcement Learning. We propose a novel, task-prioritized composition framework for Reinforcement Learning, which involves a novel concept: The indifferent-space of Reinforcement Learning policies. Our framework has the potential to facilitate knowledge transfer and modular design while greatly increasing data efficiency and data reuse for Reinforcement Learning agents. Further, our approach can ensure high-priority constraint satisfaction, which makes it promising for learning in safety-critical…
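The null-space control idea that the abstract cites as inspiration can be sketched numerically. The example below is a minimal illustration of the classical control-theory technique only, not of the paper's Reinforcement Learning framework or its indifferent-space. It assumes velocity-level control of a redundant system with a known high-priority task Jacobian; the names `null_space_projector`, `null_space_compose`, `J_high`, `dx_high`, and `dq_low` are hypothetical and do not come from the paper.

```python
import numpy as np


def null_space_projector(J_high):
    """Return N = I - J^+ J, which maps any command into the null-space of J."""
    J_pinv = np.linalg.pinv(J_high)
    return np.eye(J_high.shape[1]) - J_pinv @ J_high


def null_space_compose(J_high, dx_high, dq_low):
    """Combine a high-priority task velocity with a low-priority joint command.

    The low-priority command is projected into the null-space of the
    high-priority task, so it cannot disturb that task.
    """
    J_pinv = np.linalg.pinv(J_high)
    return J_pinv @ dx_high + null_space_projector(J_high) @ dq_low


# Illustrative values (assumed): 3 joints, 2-dimensional high-priority task.
J_high = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
dx_high = np.array([0.5, -0.2])      # desired high-priority task velocity
dq_low = np.array([1.0, 1.0, 1.0])   # low-priority joint-velocity preference

dq = null_space_compose(J_high, dx_high, dq_low)

# The high-priority task is met exactly, whatever dq_low is.
print(np.allclose(J_high @ dq, dx_high))
```

Because `J_high` has full row rank here, the projected low-priority term contributes nothing to the high-priority task velocity. This is the sense in which the lower-priority action is restricted to what the higher-priority task leaves free.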
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification
