Theoretical Study of Conflict-Avoidant Multi-Objective Reinforcement Learning
Yudan Wang, Peiyao Xiao, Hao Ban, Kaiyi Ji, Shaofeng Zou

TL;DR
This paper introduces conflict-avoidant multi-objective reinforcement learning algorithms that improve convergence and performance by dynamically adjusting task weights to mitigate gradient conflicts.
Contribution
It develops a dynamic-weighting multi-task actor-critic algorithm (MTAC) with two task-weight update options, MTAC-CA and MTAC-FC, each with theoretical convergence guarantees and a sample-complexity bound.
Findings
MTAC-CA achieves $\tilde{O}(\epsilon^{-5})$ sample complexity.
MTAC-FC improves to $\tilde{O}(\epsilon^{-3})$ sample complexity.
Both algorithms outperform existing methods on the MT10 benchmark.
Abstract
Multi-task reinforcement learning (MTRL) has shown great promise in many real-world applications. Existing MTRL algorithms often aim to learn a policy that optimizes individual objective functions simultaneously with a given prior preference (or weights) on different tasks. However, these methods often suffer from \textit{gradient conflict}, in which the tasks with larger gradients dominate the update direction, resulting in performance degradation on other tasks. In this paper, we develop a novel dynamic weighting multi-task actor-critic algorithm (MTAC) with two options for the task-weight update sub-procedure, named CA and FC. MTAC-CA aims to find a conflict-avoidant (CA) update direction that maximizes the minimum value improvement among tasks, and MTAC-FC targets a much faster convergence rate. We provide a comprehensive finite-time convergence analysis for…
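To make the idea of a conflict-avoidant direction concrete, here is a minimal sketch for two tasks. It is not the paper's MTAC-CA update. It uses the standard min-norm formulation, in which the weights on the probability simplex that minimize the norm of the weighted gradient sum give a direction that maximizes the minimum first-order improvement across tasks. The function names `min_norm_weight` and `ca_direction` and the toy gradients are hypothetical, chosen only for illustration.

```python
import numpy as np

def min_norm_weight(g1, g2):
    """Return lam in [0, 1] minimizing ||lam*g1 + (1-lam)*g2||^2 (two-task closed form)."""
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom == 0.0:
        # Identical gradients: any weight gives the same direction.
        return 0.5
    lam = float((g2 - g1) @ g2) / denom
    # Clip to the simplex [0, 1].
    return min(1.0, max(0.0, lam))

def ca_direction(g1, g2):
    """Weighted direction using the min-norm weight (illustrative, not the paper's estimator)."""
    lam = min_norm_weight(g1, g2)
    return lam * g1 + (1.0 - lam) * g2, lam

# Toy gradients (assumed values): task 1 has a much larger gradient than task 2.
g1 = np.array([10.0, 0.0])
g2 = np.array([-1.0, 1.0])

avg = 0.5 * (g1 + g2)        # fixed equal weights
d, lam = ca_direction(g1, g2)

print("equal weights:   <avg, g1> =", avg @ g1, " <avg, g2> =", avg @ g2)
print("min-norm weight: lam =", lam, " <d, g1> =", d @ g1, " <d, g2> =", d @ g2)
```

With these toy values the equal-weight direction has a negative inner product with the second task's gradient, so to first order it hurts that task, while the min-norm direction has a positive inner product with both. The actual algorithm works with stochastic actor-critic estimates rather than exact gradients, which this sketch does not model.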
Taxonomy
Topics: Reinforcement Learning in Robotics
