Multi-Policy Pareto Front Tracking Based Online and Offline Multi-Objective Reinforcement Learning

Zeyu Zhao; Yueling Che; Kaichen Liu; Jian Li; Junmei Yao

arXiv:2508.02217·cs.LG·August 5, 2025

TL;DR

This paper introduces a novel Multi-policy Pareto Front Tracking framework for multi-objective reinforcement learning that improves efficiency and performance by avoiding large policy populations and effectively tracking the Pareto front.

Contribution

The paper proposes the MPFT framework, which tracks the Pareto front without maintaining a policy population. It applies to both online and offline MORL and reduces agent-environment interactions and computational cost.

Findings

01 Superior hypervolume performance over benchmarks

02 Reduced agent-environment interactions

03 Effective Pareto front approximation in robotic tasks
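The first finding is stated in terms of hypervolume, the size of the objective-space region dominated by an approximated Pareto front and bounded by a reference point. As an illustration only, and not the paper's evaluation code, the following minimal sketch computes it for two objectives. It assumes both objectives are maximized; the function name `hypervolume_2d` and the sample points are hypothetical.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Iterable[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded below by `ref`.

    Assumes two objectives, both maximized. Points that do not strictly
    improve on the reference point in both objectives contribute nothing.
    """
    # Discard points that do not beat the reference point.
    pts = [(x, y) for x, y in points if x > ref[0] and y > ref[1]]
    # Sweep from the largest first objective to the smallest.
    pts.sort(key=lambda p: p[0], reverse=True)

    area = 0.0
    best_y = ref[1]  # highest second objective seen so far in the sweep
    for x, y in pts:
        if y > best_y:
            # Add the new horizontal strip that this point uncovers.
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


# Hypothetical two-objective returns of three policies.
front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
hv = hypervolume_2d(front, ref=(0.0, 0.0))
```

A larger value means the front covers more of the objective space relative to the chosen reference point, so comparisons are only meaningful when every method is scored against the same reference point.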

Abstract

Multi-objective reinforcement learning (MORL) plays a pivotal role in addressing real-world multi-criteria decision-making problems. Multi-policy (MP) methods are widely used to obtain high-quality Pareto front approximations for MORL problems. However, traditional MP methods rely only on online reinforcement learning (RL) and adopt an evolutionary framework with a large policy population. This may lead to sample inefficiency and/or excessive agent-environment interactions in practice. By forsaking the evolutionary framework, we propose the novel Multi-policy Pareto Front Tracking (MPFT) framework, which does not maintain any policy population and in which both online and offline MORL algorithms can be applied. The proposed MPFT framework includes four stages: Stage 1 approximates all the Pareto-vertex policies, whose mappings to the objective space fall on the vertices…
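The abstract relies on the notion of a Pareto front: the set of return vectors that no other achievable return vector dominates. As a generic illustration, not a step of MPFT itself, the sketch below filters a set of candidate return vectors down to the non-dominated ones. It assumes every objective is maximized; the names `dominates` and `pareto_front` and the sample data are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dominates(a: Vector, b: Vector) -> bool:
    """True if `a` is at least as good as `b` in every objective
    and strictly better in at least one (maximization assumed)."""
    return all(x >= y for x, y in zip(a, b)) and any(
        x > y for x, y in zip(a, b)
    )


def pareto_front(returns: List[Vector]) -> List[Vector]:
    """Keep the return vectors that no other vector dominates.

    Quadratic in the number of vectors, which is fine for small sets.
    """
    return [
        p for p in returns
        if not any(dominates(q, p) for q in returns)
    ]


# Hypothetical two-objective returns of five candidate policies.
candidates = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.0, 1.0), (2.0, 1.0)]
front = pareto_front(candidates)
```

Here `(1.0, 1.0)` and `(2.0, 1.0)` are dropped because `(2.0, 2.0)` is at least as good in both objectives and strictly better in one.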

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Advanced Multi-Objective Optimization Algorithms