Theoretical Study of Conflict-Avoidant Multi-Objective Reinforcement Learning

Yudan Wang; Peiyao Xiao; Hao Ban; Kaiyi Ji; Shaofeng Zou

arXiv:2405.16077·cs.LG·December 24, 2024

TL;DR

This paper introduces conflict-avoidant multi-objective reinforcement learning algorithms that improve convergence and performance by dynamically adjusting task weights to mitigate gradient conflicts.

Contribution

It develops two novel algorithms, MTAC-CA and MTAC-FC, with theoretical convergence guarantees and enhanced sample efficiency for multi-task reinforcement learning.

Findings

01

MTAC-CA achieves $\tilde{O}(\epsilon^{-5})$ sample complexity.

02

MTAC-FC improves to $\tilde{O}(\epsilon^{-3})$ sample complexity.

03

Both algorithms outperform existing methods on the MT10 benchmark.

Abstract

Multi-task reinforcement learning (MTRL) has shown great promise in many real-world applications. Existing MTRL algorithms often aim to learn a policy that optimizes individual objective functions simultaneously under a given prior preference (or set of weights) over the tasks. However, these methods often suffer from gradient conflict: tasks with larger gradients dominate the update direction, which degrades performance on the other tasks. In this paper, we develop a novel dynamic-weighting multi-task actor-critic algorithm (MTAC) with two options, named CA and FC, for the sub-procedure that updates the task weights. MTAC-CA aims to find a conflict-avoidant (CA) update direction that maximizes the minimum value improvement among tasks, and MTAC-FC targets a much faster convergence rate. We provide a comprehensive finite-time convergence analysis for…
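The gradient-conflict issue described in the abstract can be illustrated with a small two-task toy example. The sketch below is not the paper's MTAC-CA update; it assumes the standard closed-form min-norm combination of two gradients (as used in multiple-gradient descent methods), and the gradient values and helper names (`dot`, `combine`, `min_norm_weight`) are hypothetical. It shows a fixed 50/50 weighting producing a direction that hurts the small-gradient task, while a dynamically chosen weight yields a direction that has a positive inner product with both task gradients.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def combine(g1, g2, w):
    """Weighted direction w * g1 + (1 - w) * g2."""
    return [w * a + (1.0 - w) * b for a, b in zip(g1, g2)]


def min_norm_weight(g1, g2):
    """Weight w in [0, 1] minimizing ||w * g1 + (1 - w) * g2||^2.

    Closed form for two gradients: the unconstrained minimizer is
    (g2 - g1) . g2 / ||g1 - g2||^2, clipped to [0, 1].
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = dot(diff, diff)
    if denom == 0.0:
        return 0.5  # identical gradients: any weight gives the same direction
    w = dot([b - a for a, b in zip(g1, g2)], g2) / denom
    return min(1.0, max(0.0, w))


# Hypothetical task gradients: task 1 has a much larger gradient than task 2.
g1 = [10.0, 0.0]
g2 = [-1.0, 1.0]

fixed = combine(g1, g2, 0.5)   # fixed, equal task weights
w = min_norm_weight(g1, g2)    # dynamically chosen task weight
ca = combine(g1, g2, w)        # conflict-avoidant direction (toy version)

# A negative inner product means the update moves against that task's gradient.
print("fixed weights:", dot(fixed, g1), dot(fixed, g2))
print("dynamic weight:", w, dot(ca, g1), dot(ca, g2))
```

With the fixed weights, the large gradient of task 1 dominates and the direction conflicts with task 2; with the min-norm weight, both inner products are positive, so a small step along the direction improves both tasks to first order.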

Taxonomy

Topics: Reinforcement Learning in Robotics