S-EPOA: Overcoming the Indistinguishability of Segments with Skill-Driven Preference-Based Reinforcement Learning
Ni Mu, Yao Luan, Yiqin Yang, Bo Xu, Qing-shan Jia

TL;DR
This paper introduces S-EPOA, a skill-driven preference-based reinforcement learning method that overcomes segment indistinguishability by integrating skill mechanisms, leading to improved robustness and efficiency in robotic tasks.
Contribution
The paper presents S-EPOA, a novel algorithm that incorporates skill learning and a new query mechanism to enhance preference-based reinforcement learning.
Findings
S-EPOA outperforms traditional PbRL in robotic manipulation and locomotion tasks.
Skill integration improves robustness and learning efficiency.
Experimental results validate the effectiveness of the proposed method.
Abstract
Preference-based reinforcement learning (PbRL) stands out by utilizing human preferences as a direct reward signal, eliminating the need for intricate reward engineering. However, despite its potential, traditional PbRL methods are often constrained by the indistinguishability of segments, which impedes the learning process. In this paper, we introduce the Skill-Enhanced Preference Optimization Algorithm (S-EPOA), which addresses the segment indistinguishability issue by integrating skill mechanisms into the preference learning framework. Specifically, we first conduct unsupervised pretraining to learn useful skills. Then, we propose a novel query selection mechanism that balances information gain and distinguishability over the learned skill space. Experimental results on a range of tasks, including robotic manipulation and locomotion, demonstrate that S-EPOA significantly outperforms…
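The abstract describes the query selection mechanism only at a high level, so the following is a minimal sketch of what "balancing information gain and distinguishability" could look like, not the authors' implementation. It assumes each candidate segment carries a skill vector and a list of returns predicted by an ensemble of reward models. Information gain is approximated by ensemble disagreement on the Bradley-Terry preference probability (a common choice in PbRL), and distinguishability by Euclidean distance in skill space. The names `bt_prob`, `info_gain_proxy`, `skill_distance`, `select_queries`, and the `weight` parameter are all hypothetical.

```python
import math
from itertools import combinations


def bt_prob(return_a, return_b):
    """Bradley-Terry probability that segment A is preferred over segment B,
    given the summed predicted reward of each segment."""
    return 1.0 / (1.0 + math.exp(return_b - return_a))


def info_gain_proxy(returns_a, returns_b):
    """Assumed proxy for information gain: variance of the predicted
    preference probability across reward-model ensemble members."""
    probs = [bt_prob(a, b) for a, b in zip(returns_a, returns_b)]
    mean = sum(probs) / len(probs)
    return sum((p - mean) ** 2 for p in probs) / len(probs)


def skill_distance(skill_a, skill_b):
    """Assumed proxy for distinguishability: Euclidean distance between
    the skill vectors that generated the two segments."""
    return math.dist(skill_a, skill_b)


def select_queries(candidates, n_queries, weight=1.0):
    """Rank all candidate pairs by a weighted sum of the two proxies and
    return the index pairs with the highest scores.

    Each candidate is a dict with keys "skill" (sequence of floats) and
    "returns" (one predicted return per ensemble member).
    """
    scored = []
    for i, j in combinations(range(len(candidates)), 2):
        a, b = candidates[i], candidates[j]
        score = info_gain_proxy(a["returns"], b["returns"]) + weight * skill_distance(
            a["skill"], b["skill"]
        )
        scored.append((score, i, j))
    scored.sort(reverse=True)
    return [(i, j) for _, i, j in scored[:n_queries]]


# Hypothetical candidates: two nearly identical skills and one distant skill.
candidates = [
    {"skill": (0.0, 0.0), "returns": [1.0, 1.0]},
    {"skill": (0.0, 0.1), "returns": [1.0, 1.0]},
    {"skill": (3.0, 4.0), "returns": [1.0, 1.0]},
]
best_pairs = select_queries(candidates, n_queries=1)
```

With identical ensemble predictions, the disagreement term is zero for every pair, so the ranking here is driven entirely by skill distance. Setting `weight=0.0` would reduce the sketch to a plain disagreement-based query scheme, which is the kind of baseline that can end up asking annotators to compare segments that look alike.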
Taxonomy
Topics: Natural Language Processing Techniques · Multi-Agent Systems and Negotiation · Reinforcement Learning in Robotics
