Offline Policy Learning via Skill-step Abstraction for Long-horizon Goal-Conditioned Tasks
Donghoon Kim, Minjong Yoo, Honguk Woo

TL;DR
This paper introduces GLvSA, an offline skill-step abstraction framework for goal-conditioned policy learning. It addresses long-horizon goals and goal distribution shifts by decomposing goals into skill-aligned sub-goals.
Contribution
The paper proposes a novel offline framework that combines skill-based goal decomposition with hierarchical policy learning for long-horizon tasks.
Findings
Outperforms existing methods in maze and kitchen environments.
Achieves competitive zero-shot and few-shot adaptation.
Demonstrates efficiency in long-horizon goal tasks.
Abstract
Goal-conditioned (GC) policy learning often faces a challenge arising from the sparsity of rewards when confronting long-horizon goals. To address this challenge, we explore skill-based GC policy learning in offline settings, where skills are acquired from existing data and long-horizon goals are decomposed into sequences of near-term goals that align with these skills. Specifically, we present an "offline GC policy learning via skill-step abstraction" framework (GLvSA) tailored for tackling long-horizon GC tasks affected by goal distribution shifts. In the framework, a GC policy is progressively learned offline in conjunction with the incremental modeling of skill-step abstractions on the data. We also devise a GC policy hierarchy that not only accelerates GC policy learning within the framework but also allows for parameter-efficient fine-tuning of the policy. Through experiments with…
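As a rough illustration of what decomposing a long-horizon goal into skill-aligned near-term goals can look like, the toy sketch below picks one sub-goal every fixed number of steps along a recorded trajectory. This is not the GLvSA algorithm: the abstract describes skills and skill-step abstractions that are learned from offline data, whereas the fixed stride `skill_horizon`, the 2-D `State` type, and the function name `skill_step_subgoals` are all assumptions made here for illustration only.

```python
from typing import List, Sequence, Tuple

# Hypothetical state type: a 2-D position, as in a maze-like task.
State = Tuple[float, float]


def skill_step_subgoals(trajectory: Sequence[State], skill_horizon: int) -> List[State]:
    """Return one sub-goal per skill step along a trajectory.

    Assumption: every skill lasts exactly `skill_horizon` environment steps,
    so the state reached after each skill serves as the next near-term goal.
    The final state is always included as the last sub-goal.
    """
    if skill_horizon <= 0:
        raise ValueError("skill_horizon must be positive")
    if not trajectory:
        return []

    last = len(trajectory) - 1
    # Indices of the states reached after each full skill execution.
    indices = list(range(skill_horizon, len(trajectory), skill_horizon))
    # Make sure the long-horizon goal itself closes the sequence.
    if not indices or indices[-1] != last:
        indices.append(last)
    return [trajectory[i] for i in indices]


if __name__ == "__main__":
    # A straight-line trajectory of 10 states; the last state is the long-horizon goal.
    demo = [(float(i), 0.0) for i in range(10)]
    print(skill_step_subgoals(demo, skill_horizon=4))
```

With a 10-state trajectory and a stride of 4, the sketch yields the states at steps 4 and 8 followed by the final state, so a policy would only ever be asked to reach a goal at most one skill step away.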
Taxonomy
Topics: IoT and Edge/Fog Computing · Reinforcement Learning in Robotics · Ethics and Social Impacts of AI
Methods: ALIGN
