LPPG-RL: Lexicographically Projected Policy Gradient Reinforcement Learning with Subproblem Exploration

Ruiyu Qiu; Rui Wang; Guanghui Yang; Xiang Li; Zhijiang Shao

arXiv:2511.08339 · cs.LG · November 12, 2025

Open Access

TL;DR

This paper introduces LPPG-RL, a novel reinforcement learning framework for lexicographic multi-objective problems that efficiently enforces priority orderings and improves convergence in continuous spaces.

Contribution

LPPG-RL employs sequential gradient projections with Dykstra's method and introduces Subproblem Exploration to enhance stability and speed in continuous multi-objective RL.
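
To make the projection step concrete, here is a minimal sketch of Dykstra's method for projecting a vector onto an intersection of half-spaces. It is not taken from the paper: the choice of half-space constraints of the form a·x >= 0 (an update direction that does not oppose a higher-priority gradient `a`) and the names `project_halfspace` and `dykstra_project` are illustrative assumptions.

```python
from typing import List, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_halfspace(z: Sequence[float], a: Sequence[float]) -> Vector:
    """Project z onto the half-space {x : a.x >= 0} (assumed constraint form)."""
    s = dot(a, z)
    if s >= 0.0:
        return list(z)  # already feasible, nothing to do
    scale = s / dot(a, a)  # s < 0 implies a is nonzero, so this is safe
    return [zi - scale * ai for zi, ai in zip(z, a)]


def dykstra_project(x0: Sequence[float],
                    normals: Sequence[Sequence[float]],
                    iters: int = 200,
                    tol: float = 1e-10) -> Vector:
    """Approximate the projection of x0 onto the intersection of
    the half-spaces {x : a.x >= 0} for every a in normals."""
    x = list(x0)
    # One correction vector per constraint set, as Dykstra's method requires.
    corrections = [[0.0] * len(x) for _ in normals]
    for _ in range(iters):
        x_prev = list(x)
        for i, a in enumerate(normals):
            z = [xi + ci for xi, ci in zip(x, corrections[i])]
            y = project_halfspace(z, a)
            corrections[i] = [zi - yi for zi, yi in zip(z, y)]
            x = y
        # Stop once a full sweep over all sets no longer moves the iterate.
        if max(abs(xi - pi) for xi, pi in zip(x, x_prev)) < tol:
            break
    return x


if __name__ == "__main__":
    # Hypothetical candidate direction and two higher-priority gradients.
    candidate = [-1.0, 2.0]
    higher_priority = [[1.0, 0.0], [1.0, 1.0]]
    print(dykstra_project(candidate, higher_priority))
```

Unlike plain alternating projections, the per-set correction vectors make the iterates converge to the nearest point in the intersection, not just to some feasible point. That property matters when the projected vector should stay as close as possible to the original direction.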

Findings

01. Outperforms state-of-the-art LMORL methods in 2D navigation tasks.

02. Provides theoretical convergence guarantees and policy improvement bounds.

03. Demonstrates effectiveness in continuous policy spaces.

Abstract

Lexicographic multi-objective problems, which consist of multiple conflicting subtasks with explicit priorities, are common in real-world applications. Despite the advantages of Reinforcement Learning (RL) in single tasks, extending conventional RL methods to prioritized multiple objectives remains challenging. In particular, traditional Safe RL and Multi-Objective RL (MORL) methods have difficulty enforcing priority orderings efficiently. Therefore, Lexicographic Multi-Objective RL (LMORL) methods have been developed to address these challenges. However, existing LMORL methods either rely on heuristic threshold tuning with prior knowledge or are restricted to discrete domains. To overcome these limitations, we propose Lexicographically Projected Policy Gradient RL (LPPG-RL), a novel LMORL framework which leverages sequential gradient projections to identify feasible policy update…

Taxonomy

Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Advanced Multi-Objective Optimization Algorithms