Task Transfer by Preference-Based Cost Learning
Mingxuan Jing, Xiaojian Ma, Wenbing Huang, Fuchun Sun, Huaping Liu

TL;DR
This paper introduces a novel reinforcement learning task transfer method that uses expert preferences instead of explicit demonstrations or cost functions, enabling more practical transfer learning.
Contribution
It proposes a framework that relaxes the need for explicit demonstrations or cost functions by leveraging expert preferences and an iterative learning process.
Findings
Effective in transferring policies without explicit cost functions
Converges reliably under theoretical analysis
Shows superior performance on benchmark tasks
Abstract
The goal of task transfer in reinforcement learning is migrating the action policy of an agent to the target task from the source task. Given their successes on robotic action planning, current methods mostly rely on two requirements: exactly-relevant expert demonstrations or the explicitly-coded cost function on target task, both of which, however, are inconvenient to obtain in practice. In this paper, we relax these two strong conditions by developing a novel task transfer framework where the expert preference is applied as a guidance. In particular, we alternate the following two steps: Firstly, letting experts apply pre-defined preference rules to select related expert demonstrates for the target task. Secondly, based on the selection result, we learn the target cost function and trajectory distribution simultaneously via enhanced Adversarial MaxEnt IRL and generate more…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsReinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Robot Manipulation and Learning
