Enabling Pareto-Stationarity Exploration in Multi-Objective Reinforcement Learning: A Multi-Objective Weighted-Chebyshev Actor-Critic Approach

Fnu Hairi; Jiao Yang; Tianchen Zhou; Haibo Yang; Chaosheng Dong; Fan Yang; Michinari Momma; Yan Gao; Jia Liu

arXiv:2507.21397·cs.LG·July 30, 2025

TL;DR

This paper introduces MOCHA, a novel algorithm for multi-objective reinforcement learning that systematically explores Pareto-stationary solutions with finite-time sample complexity guarantees, demonstrated to outperform baselines in simulations.

Contribution

The paper proposes MOCHA, which integrates weighted-Chebyshev (WC) scalarization with the actor-critic framework to explore Pareto-stationary solutions in MORL with theoretical finite-time sample complexity guarantees.
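As a rough illustration of the weighted-Chebyshev idea named above, the sketch below scores a few candidate reward vectors with the textbook WC scalarization, $\max_i p_i \, |z_i^* - J_i|$, and shows how changing the weight vector $p$ selects different trade-offs. This is a minimal sketch under stated assumptions, not the paper's algorithm: the function names, the candidate reward vectors, and the reference point `reference` are all hypothetical, and the exact WC objective used by MOCHA may differ.

```python
import numpy as np

def weighted_chebyshev(values, weights, reference):
    """Textbook weighted-Chebyshev score: max_i p_i * |z*_i - J_i| (lower is better).

    values    -- per-objective returns J_i of one candidate (hypothetical data)
    weights   -- weight vector p on the simplex
    reference -- reference (ideal) point z*, an assumption of this sketch
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.max(weights * np.abs(reference - values)))

# Hypothetical per-objective returns of three candidate policies.
CANDIDATES = {"a": [1.0, 0.2], "b": [0.6, 0.6], "c": [0.1, 1.0]}
REFERENCE = [1.0, 1.0]  # assumed ideal point for both objectives

def best_candidate(weights):
    """Return the name of the candidate with the smallest WC score."""
    return min(
        CANDIDATES,
        key=lambda name: weighted_chebyshev(CANDIDATES[name], weights, REFERENCE),
    )

if __name__ == "__main__":
    # Sweeping the weight vector moves the selected trade-off.
    for p in ([0.9, 0.1], [0.5, 0.5], [0.1, 0.9]):
        print(p, "->", best_candidate(p))
```

Sweeping the weight vector is what makes exploration of different trade-offs systematic in this toy setting; in the paper, the scalarization is combined with actor-critic updates rather than applied to a fixed candidate set.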

Findings

01

MOCHA achieves $\tilde{O}(\frac{1}{\epsilon^2})$ sample complexity.

02

MOCHA outperforms baseline MORL algorithms in simulations.

03

Sample complexity depends on the minimum weight entry $p_{\min}$.

Abstract

In many multi-objective reinforcement learning (MORL) applications, systematically exploring the Pareto-stationary solutions under multiple non-convex reward objectives with a theoretical finite-time sample complexity guarantee is an important yet under-explored problem. This motivates us to take the first step toward filling this gap in MORL. Specifically, in this paper, we propose a Multi-Objective weighted-CHebyshev Actor-critic (MOCHA) algorithm for MORL, which judiciously integrates the weighted-Chebyshev (WC) and actor-critic frameworks to enable systematic Pareto-stationarity exploration with a finite-time sample complexity guarantee. The sample complexity result of the MOCHA algorithm reveals an interesting dependency on $p_{\min}$ in finding an $\epsilon$-Pareto-stationary solution, where $p_{\min}$ denotes the minimum entry of a given…
