Reachability Weighted Offline Goal-conditioned Resampling
Wenyan Yang, Joni Pajarinen

TL;DR
This paper introduces Reachability Weighted Sampling (RWS), a method that improves offline goal-conditioned reinforcement learning by prioritizing transitions that are more likely to lead to goal achievement, with reported gains on six robotic manipulation tasks.
Contribution
The paper proposes RWS, a reachability-based sampling method that improves offline goal-conditioned RL by focusing on reachable transitions and that can be combined with existing algorithms.
Findings
RWS significantly improves performance across six robotic manipulation tasks.
Performance on the HandBlock-Z task increased by nearly 50%.
Reachability-based sampling outperforms uniform sampling in complex environments.
Abstract
Offline goal-conditioned reinforcement learning (RL) relies on fixed datasets where many potential goals share the same state and action spaces. However, these potential goals are not explicitly represented in the collected trajectories. To learn a generalizable goal-conditioned policy, it is common to sample goals and state-action pairs uniformly using dynamic programming methods such as Q-learning. Uniform sampling, however, requires an intractably large dataset to cover all possible combinations and creates many unreachable state-goal-action pairs that degrade policy performance. Our key insight is that sampling should favor transitions that enable goal achievement. To this end, we propose Reachability Weighted Sampling (RWS). RWS uses a reachability classifier trained via positive-unlabeled (PU) learning on goal-conditioned state-action values. The classifier maps these values to a…
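The core idea in the abstract, sampling in favor of transitions that enable goal achievement rather than uniformly, can be illustrated with a small sketch. This is not the authors' implementation: the abstract is cut off before it says what the classifier outputs, so the sigmoid in `reachability_score` is a purely hypothetical stand-in for the PU-trained classifier, and the function names, the `scale` and `offset` parameters, and the toy Q-values are all assumptions made for illustration.

```python
import math
import random


def reachability_score(q_value, scale=1.0, offset=0.0):
    """Hypothetical stand-in for the reachability classifier.

    Squashes a goal-conditioned Q-value into (0, 1). The real classifier
    in the paper is trained with PU learning; this sigmoid is an assumption.
    """
    return 1.0 / (1.0 + math.exp(-(scale * q_value + offset)))


def sampling_weights(q_values):
    """Normalise the scores so they form a probability distribution."""
    scores = [reachability_score(q) for q in q_values]
    total = sum(scores)
    return [s / total for s in scores]


def sample_indices(q_values, batch_size, rng):
    """Draw tuple indices with probability proportional to their score."""
    weights = sampling_weights(q_values)
    return rng.choices(range(len(q_values)), weights=weights, k=batch_size)


# Toy Q-values for four (state, goal, action) tuples; made-up numbers.
toy_q = [-8.0, -0.5, 0.0, 3.0]
weights = sampling_weights(toy_q)
batch = sample_indices(toy_q, batch_size=1000, rng=random.Random(0))
```

Under these assumptions, the tuple with the lowest Q-value is drawn far less often than under uniform sampling, while the tuple with the highest Q-value dominates the batch. Any offline goal-conditioned learner that consumes minibatches could, in principle, be fed indices drawn this way instead of uniformly sampled ones.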
Taxonomy
Topics: Model-Driven Software Engineering Techniques
