An Offline Adaptation Framework for Constrained Multi-Objective Reinforcement Learning
Qian Lin, Zongkai Liu, Danying Mo, Chao Yu

TL;DR
This paper introduces an offline adaptation framework for multi-objective reinforcement learning that infers preferences from demonstrations without requiring explicit preference specifications, and extends to safety constraints.
Contribution
The paper presents an offline framework that adapts policies from demonstrations alone, without predefined preferences, and handles safety constraints by using safe demonstrations.
Findings
Successfully infers policies aligned with implicit preferences
Effectively handles safety constraints with demonstrations
Demonstrates strong performance on offline multi-objective tasks
Abstract
In recent years, significant progress has been made in multi-objective reinforcement learning (RL) research, which aims to balance multiple objectives by incorporating preferences for each objective. In most existing studies, specific preferences must be provided during deployment to indicate the desired policies explicitly. However, designing these preferences depends heavily on human prior knowledge, which is typically obtained through extensive observation of high-performing demonstrations with expected behaviors. In this work, we propose a simple yet effective offline adaptation framework for multi-objective RL problems without assuming handcrafted target preferences, but only given several demonstrations to implicitly indicate the preferences of expected policies. Additionally, we demonstrate that our framework can naturally be extended to meet constraints on safety-critical…
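The idea of indicating a preference through demonstrations rather than a handcrafted weight vector can be illustrated with a toy example. The sketch below is not the paper's method, which this summary does not describe. It assumes linear scalarization of two objectives, a made-up preference-conditioned policy (`toy_policy`), and a simple grid search over the preference simplex (`infer_preference`); all names and numbers are hypothetical.

```python
import math

def scalarize(returns, preference):
    """Linear scalarization: weighted sum of per-objective returns."""
    return sum(w * r for w, r in zip(preference, returns))

def toy_policy(state, preference):
    """Hypothetical preference-conditioned policy for two objectives.

    Blends two fixed behaviours according to the preference weights.
    This is an illustrative stand-in, not a learned policy.
    """
    action_obj0 = 1.0 * state    # behaviour assumed to favour objective 0
    action_obj1 = -0.5 * state   # behaviour assumed to favour objective 1
    return preference[0] * action_obj0 + preference[1] * action_obj1

def infer_preference(demos, policy, grid_size=101):
    """Search the 2-objective simplex for the preference whose policy
    best reproduces the demonstrated actions (sum of squared errors)."""
    best_pref, best_err = None, math.inf
    for i in range(grid_size):
        w0 = i / (grid_size - 1)
        pref = (w0, 1.0 - w0)
        err = sum((policy(s, pref) - a) ** 2 for s, a in demos)
        if err < best_err:
            best_pref, best_err = pref, err
    return best_pref

# Demonstrations as (state, action) pairs, generated here with a hidden
# preference that the inference step never sees directly.
hidden = (0.7, 0.3)
demos = [(s, toy_policy(s, hidden)) for s in (0.5, 1.0, 2.0, -1.5)]

inferred = infer_preference(demos, toy_policy)
print("inferred preference:", inferred)
print("scalarized return:", scalarize((10.0, 4.0), inferred))
```

Grid search is chosen only because it is easy to read for two objectives; with more objectives or a learned policy, a gradient-based or sampling-based search over the simplex would be the more likely choice.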
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control
Methods: ALIGN
