Move-Then-Operate: Behavioral Phasing for Human-Like Robotic Manipulation

Haoming Xu; Lei Lei; Jie Gu; Chu Tang; Jingmin Chen; Ruiqi Wang

arXiv:2604.23620 · cs.RO · April 28, 2026

TL;DR

The paper introduces Move-Then-Operate, a dual-phase robotic manipulation framework that improves success rates and training efficiency by explicitly separating movement and contact phases with a learnable phase selector.

Contribution

It proposes a novel dual-expert policy architecture with automatic phase labeling, enhancing manipulation performance and data efficiency over monolithic approaches.

Findings

01

Achieves an average success rate of 68.9% on the RoboTwin2 benchmark.

02

Outperforms the monolithic $\pi_{0}$ baseline by 24%.

03

Reaches peak performance in 40% fewer training steps.

Abstract

We present Move-Then-Operate, a vision-language-action (VLA) framework that explicitly decouples robotic manipulation into two distinct behavioral phases: coarse relocation (move) and contact-critical interaction (operate). Unlike monolithic policies that conflate these heterogeneous regimes, our architecture employs a dual-expert policy routed by a learnable phase selector, introducing a structural inductive bias that isolates phase-specific dynamics. Phase labels are automatically generated via an MLLM-based pipeline conditioned on lightweight contextual cues, such as end-effector velocity and subtask decomposition, to ensure alignment with human motor patterns. Evaluated on the RoboTwin2 benchmark, our method achieves an average success rate of $68.9\%$, outperforming the monolithic $\pi_{0}$ baseline by $24\%$. It matches or exceeds models trained on $10 \times$ more data and reaches peak…
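
The routing idea in the abstract, a phase selector that dispatches each step to either a move expert or an operate expert, can be illustrated with a toy sketch. This is not the authors' implementation: the `Observation` fields, the logistic selector with hand-set weights, the proportional-gain experts, and the hard 0.5 threshold are all assumptions chosen for illustration. In the paper the selector is learned and the experts are policy networks.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    """Hypothetical observation: offset from end-effector to target, and current speed."""
    offset: List[float]
    speed: float

    @property
    def distance(self) -> float:
        return math.sqrt(sum(x * x for x in self.offset))


def move_expert(obs: Observation) -> List[float]:
    """Stand-in for the coarse-relocation expert: a large proportional step."""
    return [0.5 * x for x in obs.offset]


def operate_expert(obs: Observation) -> List[float]:
    """Stand-in for the contact-critical expert: a small, cautious step."""
    return [0.05 * x for x in obs.offset]


@dataclass
class PhaseSelector:
    """Toy logistic selector over [distance, speed]; weights would be learned in practice."""
    weights: List[float]
    bias: float = 0.0

    def prob_operate(self, obs: Observation) -> float:
        features = [obs.distance, obs.speed]
        logit = sum(w * f for w, f in zip(self.weights, features)) + self.bias
        return 1.0 / (1.0 + math.exp(-logit))


def route(selector: PhaseSelector, obs: Observation,
          threshold: float = 0.5) -> Tuple[str, List[float]]:
    """Hard routing (an assumption): pick exactly one expert per step."""
    if selector.prob_operate(obs) >= threshold:
        return "operate", operate_expert(obs)
    return "move", move_expert(obs)


if __name__ == "__main__":
    # Hand-set weights: a large distance or a high speed pushes toward "move".
    selector = PhaseSelector(weights=[-10.0, -5.0], bias=2.0)
    far = Observation(offset=[0.3, 0.4, 0.0], speed=0.3)
    near = Observation(offset=[0.0, 0.02, 0.0], speed=0.01)
    print(route(selector, far)[0])
    print(route(selector, near)[0])
```

Because only one expert acts in each phase, each can specialize in its own dynamics, which is the structural inductive bias the abstract describes. A soft mixture of the two expert outputs would be an alternative design; the abstract does not specify which is used.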
