H-GAR: A Hierarchical Interaction Framework via Goal-Driven Observation-Action Refinement for Robotic Manipulation
Yijie Zhu, Rui Shao, Ziyang Liu, Jie He, Jizhihui Liu, Jiuru Wang, Zitong Yu

TL;DR
H-GAR introduces a hierarchical, goal-driven framework for robotic manipulation that refines actions through explicit observation-action interaction, leading to more accurate and coherent manipulation behaviors.
Contribution
The paper presents H-GAR, a novel hierarchical interaction framework that integrates goal-conditioned observation synthesis and action refinement for improved robotic manipulation.
Findings
Achieves state-of-the-art performance in robotic manipulation tasks.
Effectively refines coarse actions into goal-consistent fine actions.
Demonstrates robustness in both simulation and real-world environments.
Abstract
Unified video and action prediction models hold great potential for robotic manipulation, as future observations offer contextual cues for planning, while actions reveal how interactions shape the environment. However, most existing approaches treat observation and action generation in a monolithic and goal-agnostic manner, often leading to semantically misaligned predictions and incoherent behaviors. To this end, we propose H-GAR, a Hierarchical interaction framework via Goal-driven observation-Action Refinement. To anchor prediction to the task objective, H-GAR first produces a goal observation and a coarse action sketch that outline a high-level route toward the goal. To enable explicit interaction between observation and action under the guidance of the goal observation for more coherent decision-making, we devise two synergistic modules. (1) Goal-Conditioned Observation Synthesizer…
Taxonomy
Topics: Robot Manipulation and Learning · Human Motion and Animation · Reinforcement Learning in Robotics
