LPPG-RL: Lexicographically Projected Policy Gradient Reinforcement Learning with Subproblem Exploration
Ruiyu Qiu, Rui Wang, Guanghui Yang, Xiang Li, Zhijiang Shao

TL;DR
This paper introduces LPPG-RL, a reinforcement learning framework for lexicographic multi-objective problems that enforces priority orderings efficiently and improves convergence in continuous spaces.
Contribution
LPPG-RL applies sequential gradient projections, computed with Dykstra's method, and introduces Subproblem Exploration to improve stability and convergence speed in continuous multi-objective RL.
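To make the projection step concrete, here is a minimal sketch of Dykstra's alternating projection in Python with NumPy. It is not the paper's algorithm. It assumes, purely for illustration, that each higher-priority objective with gradient g_i contributes a half-space constraint {d : g_i · d >= 0} on the update direction d, and that a lower-priority gradient is projected onto the intersection of those half-spaces. The function names `project_halfspace` and `dykstra_halfspaces`, the half-space form, and the example gradients are all hypothetical. Subproblem Exploration is not covered.

```python
import numpy as np


def project_halfspace(x, a):
    """Euclidean projection of x onto the half-space {d : a @ d >= 0}."""
    violation = a @ x
    if violation >= 0.0:
        return x  # already feasible, nothing to do
    # Remove the component of x that points against the normal a.
    return x - (violation / (a @ a)) * a


def dykstra_halfspaces(x0, normals, n_iter=200, tol=1e-10):
    """Dykstra's method: project x0 onto the intersection of half-spaces.

    `normals` is a list of vectors a_i, each defining {d : a_i @ d >= 0}.
    The correction terms are what distinguish Dykstra's method from plain
    alternating projections; they make the iterates converge to the
    nearest point of the intersection, not just to some feasible point.
    """
    x = np.asarray(x0, dtype=float).copy()
    corrections = [np.zeros_like(x) for _ in normals]
    for _ in range(n_iter):
        x_prev = x.copy()
        for i, a in enumerate(normals):
            shifted = x + corrections[i]
            y = project_halfspace(shifted, a)
            corrections[i] = shifted - y
            x = y
        if np.linalg.norm(x - x_prev) < tol:
            break
    return x


# Illustrative (assumed) gradients: g_high belongs to the higher-priority
# objective, g_low to the lower-priority one, and the two conflict.
g_high = np.array([1.0, 0.0])
g_low = np.array([-1.0, 1.0])
d = dykstra_halfspaces(g_low, [g_high])
```

In this toy case the projected direction `d` keeps the part of `g_low` that does not work against `g_high`, so a step along `d` does not decrease the higher-priority objective to first order. With several higher-priority objectives, their gradients would all be passed in `normals`.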
Findings
Outperforms state-of-the-art LMORL methods in 2D navigation tasks.
Provides theoretical convergence guarantees and policy improvement bounds.
Demonstrates effectiveness in continuous policy spaces.
Abstract
Lexicographic multi-objective problems, which consist of multiple conflicting subtasks with explicit priorities, are common in real-world applications. Despite the advantages of Reinforcement Learning (RL) in single tasks, extending conventional RL methods to prioritized multiple objectives remains challenging. In particular, traditional Safe RL and Multi-Objective RL (MORL) methods have difficulty enforcing priority orderings efficiently. Therefore, Lexicographic Multi-Objective RL (LMORL) methods have been developed to address these challenges. However, existing LMORL methods either rely on heuristic threshold tuning with prior knowledge or are restricted to discrete domains. To overcome these limitations, we propose Lexicographically Projected Policy Gradient RL (LPPG-RL), a novel LMORL framework which leverages sequential gradient projections to identify feasible policy update…
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Advanced Multi-Objective Optimization Algorithms
