Hierarchical and Modular Network on Non-prehensile Manipulation in General Environments

Yoonyoung Cho; Junhyek Han; Jisu Han; Beomjoon Kim

arXiv:2502.20843 · cs.RO · June 23, 2025

TL;DR

This paper introduces a modular neural network architecture and an environment generation method that enable robots to perform non-prehensile manipulation in diverse, unseen environments, with zero-shot transfer to the real world.

Contribution

The paper proposes a reconfigurable network architecture and an environment generation algorithm that improve the generalization of non-prehensile manipulation policies across varied environments.

Findings

01 Zero-shot transfer to real-world environments achieved

02 A new benchmark with nine digital twin scenes released

03 Enhanced policy adaptability to diverse geometric constraints

Abstract

For robots to operate in general environments like households, they must be able to perform non-prehensile manipulation actions such as toppling and rolling to manipulate ungraspable objects. However, prior works on non-prehensile manipulation cannot yet generalize across environments with diverse geometries. The main challenge lies in adapting to varying environmental constraints: within a cabinet, the robot must avoid walls and ceilings; to lift objects to the top of a step, the robot must account for the step's pose and extent. While deep reinforcement learning (RL) has demonstrated impressive success in non-prehensile manipulation, accounting for such variability presents a challenge for the generalist policy, as it must learn diverse strategies for each new combination of constraints. To address this, we propose a modular and reconfigurable architecture that adaptively reconfigures…
