Multi-Policy Pareto Front Tracking Based Online and Offline Multi-Objective Reinforcement Learning
Zeyu Zhao, Yueling Che, Kaichen Liu, Jian Li, Junmei Yao

TL;DR
This paper introduces a novel Multi-policy Pareto Front Tracking framework for multi-objective reinforcement learning that improves efficiency and performance by avoiding large policy populations and effectively tracking the Pareto front.
Contribution
The paper proposes the MPFT framework, which tracks the Pareto front without maintaining a policy population. It applies to both online and offline MORL and reduces agent-environment interactions and computational cost.
Findings
Superior hypervolume performance over benchmarks
Reduced agent-environment interactions
Effective Pareto front approximation in robotic tasks
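The hypervolume metric named in the findings can be illustrated with a small sketch. This is not the paper's evaluation code: it assumes two objectives, both maximized, and a user-chosen reference point, and the function names `pareto_front` and `hypervolume_2d` are hypothetical. It filters a set of return vectors down to the non-dominated ones and then sums the area they dominate relative to the reference point.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points (assumption: both objectives are maximized)."""
    front = []
    for p in points:
        # p is dominated if another point is at least as good in both
        # objectives and differs from p (so strictly better in at least one).
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    # Remove duplicates and sort by the first objective, ascending.
    return sorted(set(front))


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front and bounded below by the reference point."""
    # Keep only front points that strictly improve on the reference point.
    front = [p for p in pareto_front(points) if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep from the largest first objective to the smallest; along a
    # non-dominated front the second objective then increases.
    front.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    prev_y = ref[1]
    for x, y in front:
        # Add the horizontal strip between the previous height and this one.
        area += (x - ref[0]) * (y - prev_y)
        prev_y = y
    return area


if __name__ == "__main__":
    # Hypothetical return vectors for four policies.
    returns = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.0, 1.0)]
    print(pareto_front(returns))
    print(hypervolume_2d(returns, ref=(0.0, 0.0)))
```

A larger hypervolume means the approximated front covers more of the objective space above the reference point, which is why it is a common single-number summary for comparing MORL methods. The sweep above only works for two objectives; higher-dimensional cases need a dedicated algorithm.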
Abstract
Multi-objective reinforcement learning (MORL) plays a pivotal role in addressing real-world multi-criteria decision-making problems. Multi-policy (MP) methods are widely used to obtain high-quality Pareto front approximations for MORL problems. However, traditional MP methods rely only on online reinforcement learning (RL) and adopt an evolutionary framework with a large policy population. This may lead to sample inefficiency and/or an excessive number of agent-environment interactions in practice. By forsaking the evolutionary framework, we propose the novel Multi-policy Pareto Front Tracking (MPFT) framework, which does not maintain any policy population and to which both online and offline MORL algorithms can be applied. The proposed MPFT framework includes four stages: Stage 1 approximates all the Pareto-vertex policies, whose mappings to the objective space fall on the vertices…
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Advanced Multi-Objective Optimization Algorithms
