S-EPOA: Overcoming the Indistinguishability of Segments with Skill-Driven Preference-Based Reinforcement Learning

Ni Mu; Yao Luan; Yiqin Yang; Bo Xu; Qing-shan Jia

arXiv:2408.12130·cs.AI·May 14, 2025

Open Access

TL;DR

This paper introduces S-EPOA, a skill-driven preference-based reinforcement learning method that overcomes segment indistinguishability by integrating skill mechanisms, leading to improved robustness and efficiency in robotic tasks.

Contribution

The paper presents S-EPOA, a novel algorithm that incorporates skill learning and a new query mechanism to enhance preference-based reinforcement learning.

Findings

01. S-EPOA outperforms traditional PbRL in robotic manipulation and locomotion tasks.

02. Skill integration improves robustness and learning efficiency.

03. Experimental results validate the effectiveness of the proposed method.

Abstract

Preference-based reinforcement learning (PbRL) stands out by utilizing human preferences as a direct reward signal, eliminating the need for intricate reward engineering. However, despite its potential, traditional PbRL methods are often constrained by the indistinguishability of segments, which impedes the learning process. In this paper, we introduce the Skill-Enhanced Preference Optimization Algorithm (S-EPOA), which addresses the segment indistinguishability issue by integrating skill mechanisms into the preference learning framework. Specifically, we first conduct unsupervised pretraining to learn useful skills. Then, we propose a novel query selection mechanism to balance information gain and distinguishability over the learned skill space. Experimental results on a range of tasks, including robotic manipulation and locomotion, demonstrate that S-EPOA significantly outperforms…
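
The query selection idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the scoring rule, so everything below is an assumption made for illustration. Information gain is approximated by the disagreement of a reward-model ensemble, distinguishability by the Euclidean distance between two skill vectors, and the two are combined with a hypothetical `weight` parameter. The Bradley-Terry preference model is used because it is a common choice in PbRL, not because the abstract names it. All function names are hypothetical.

```python
import math
import statistics


def preference_prob(return_a, return_b):
    """Bradley-Terry probability that segment A is preferred over segment B,
    given the predicted returns of the two segments."""
    return 1.0 / (1.0 + math.exp(return_b - return_a))


def skill_distance(skill_a, skill_b):
    """Assumed distinguishability proxy: Euclidean distance in skill space."""
    return math.dist(skill_a, skill_b)


def score_pair(ensemble_returns_a, ensemble_returns_b, skill_a, skill_b, weight=1.0):
    """Score one candidate query (a pair of segments).

    ensemble_returns_a / ensemble_returns_b hold one predicted return per
    ensemble member. The spread of the resulting preference probabilities
    is an assumed stand-in for information gain.
    """
    probs = [
        preference_prob(ra, rb)
        for ra, rb in zip(ensemble_returns_a, ensemble_returns_b)
    ]
    disagreement = statistics.pstdev(probs)
    return disagreement + weight * skill_distance(skill_a, skill_b)


def select_queries(candidates, num_queries, weight=1.0):
    """Return the num_queries highest-scoring candidate pairs.

    Each candidate is a tuple:
    (ensemble_returns_a, ensemble_returns_b, skill_a, skill_b).
    """
    ranked = sorted(
        candidates,
        key=lambda c: score_pair(*c, weight=weight),
        reverse=True,
    )
    return ranked[:num_queries]


if __name__ == "__main__":
    # Two candidate pairs with identical ensemble predictions; only the
    # distance between the skills differs, so the farther pair ranks first.
    near = ([1.0, 1.0], [1.0, 1.0], (0.0, 0.0), (0.1, 0.0))
    far = ([1.0, 1.0], [1.0, 1.0], (0.0, 0.0), (3.0, 4.0))
    print(select_queries([near, far], num_queries=1) == [far])
```

With `weight=0` the sketch reduces to plain disagreement-based sampling. Raising `weight` favors pairs whose skills are far apart, which under the assumptions above are the pairs a human labeler should find easier to tell apart.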

Taxonomy

Topics: Natural Language Processing Techniques · Multi-Agent Systems and Negotiation · Reinforcement Learning in Robotics