Semantic Skill Grounding for Embodied Instruction-Following in Cross-Domain Environments
Sangwoo Shin, Seunghyun Kim, Youngsoo Jang, Moontae Lee, Honguk Woo

TL;DR
This paper introduces SemGro, a hierarchical semantic skill grounding framework that improves embodied instruction-following across diverse domains by leveraging language models' reasoning and multi-modal capabilities.
Contribution
SemGro is a novel hierarchical framework that decomposes and grounds semantic skills in target domains, enhancing cross-domain embodied instruction-following.
Findings
SemGro outperforms baselines across 300 cross-domain scenarios.
Its hierarchical skill decomposition and grounding prove effective.
It leverages language models for reasoning and for assessing skill feasibility.
Abstract
In embodied instruction-following (EIF), the integration of pretrained language models (LMs) as task planners emerges as a significant branch, where tasks are planned at the skill level by prompting LMs with pretrained skills and user instructions. However, grounding these pretrained skills in different domains remains challenging due to their intricate entanglement with the domain-specific knowledge. To address this challenge, we present a semantic skill grounding (SemGro) framework that leverages the hierarchical nature of semantic skills. SemGro recognizes the broad spectrum of these skills, ranging from short-horizon low-semantic skills that are universally applicable across domains to long-horizon rich-semantic skills that are highly specialized and tailored for particular domains. The framework employs an iterative skill decomposition approach, starting from the higher levels of…
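The iterative decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `ground_skill`, `HIERARCHY`, and `FEASIBLE` are hypothetical, and the dictionary lookup and set membership test merely stand in for the LM-based decomposition and the feasibility assessment that the summary mentions. The sketch only shows the control flow: a skill that is not executable in the target domain is broken into lower-level skills until every planned skill is executable.

```python
from typing import Callable, List

# Hypothetical toy hierarchy: each rich-semantic skill maps to lower-level skills.
# In a real system an LM would propose these decompositions; this dict is a stand-in.
HIERARCHY = {
    "make coffee": ["fetch mug", "operate coffee machine"],
    "fetch mug": ["go to mug", "pick up mug"],
    "operate coffee machine": ["place mug in machine", "press button"],
}

# Hypothetical set of skills that are executable in the target domain.
# In a real system a feasibility check (e.g. a multi-modal LM) would replace this.
FEASIBLE = {"go to mug", "pick up mug", "operate coffee machine"}


def ground_skill(
    skill: str,
    decompose: Callable[[str], List[str]],
    is_feasible: Callable[[str], bool],
    max_depth: int = 5,
) -> List[str]:
    """Ground a skill by decomposing it until every resulting skill is feasible."""
    if is_feasible(skill):
        # The skill is executable as-is, so stop at this level of the hierarchy.
        return [skill]
    if max_depth == 0:
        raise ValueError(f"depth limit reached while grounding {skill!r}")
    children = decompose(skill)
    if not children:
        raise ValueError(f"cannot ground {skill!r}: infeasible and not decomposable")
    grounded: List[str] = []
    for child in children:
        # Move one level down the hierarchy for each sub-skill.
        grounded.extend(ground_skill(child, decompose, is_feasible, max_depth - 1))
    return grounded


plan = ground_skill(
    "make coffee",
    decompose=lambda s: HIERARCHY.get(s, []),
    is_feasible=lambda s: s in FEASIBLE,
)
print(plan)
```

In this toy domain, "operate coffee machine" is kept at its original level because it is already feasible, while "fetch mug" is decomposed one level further; different branches can therefore ground at different depths.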
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Online Learning and Analytics · Multimodal Machine Learning Applications
