Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation

Yifu Yuan; Haiqin Cui; Yaoting Huang; Yibin Chen; Fei Ni; Zibin Dong; Pengyi Li; Yan Zheng; Hongyao Tang; Jianye Hao

arXiv:2508.13998 · cs.RO · April 7, 2026

TL;DR

Embodied-R1 introduces a unified pointing-based representation and reinforcement learning approach, enabling robust generalization in embodied robotic manipulation tasks without task-specific fine-tuning.

Contribution

The paper proposes a pointing-centric embodied reasoning framework and a large-scale dataset, Embodied-Points-200K, and reports state-of-the-art zero-shot performance on robotic manipulation benchmarks.

Findings

01

Embodied-R1 achieves a 56.2% zero-shot success rate in SIMPLEREnv.

02

It reaches 87.5% success across 8 real-world XArm tasks.

03

The model shows high robustness to visual disturbances.

Abstract

Generalization in embodied AI is hindered by the "seeing-to-doing gap," which stems from data scarcity and embodiment heterogeneity. To address this, we pioneer "pointing" as a unified, embodiment-agnostic intermediate representation, defining four core embodied pointing abilities that bridge high-level vision-language comprehension with low-level action primitives. We introduce Embodied-R1, a 3B Vision-Language Model (VLM) specifically designed for embodied reasoning and pointing. We use a wide range of embodied and general visual reasoning datasets as sources to construct a large-scale dataset, Embodied-Points-200K, which supports key embodied pointing capabilities. We then train Embodied-R1 using a two-stage Reinforced Fine-tuning (RFT) curriculum with a specialized multi-task reward design. Embodied-R1 achieves state-of-the-art performance on 11 embodied spatial and pointing…
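To make the idea of rewarding a pointing output concrete, here is a minimal sketch of one way a predicted 2D point could be scored against a target region and combined with a format check. This is an illustrative assumption, not the reward used in the paper: the function names `point_in_mask_reward` and `combined_reward`, the binary-mask target, and the weights `w_format` and `w_point` are all hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates; hypothetical convention


def point_in_mask_reward(point: Point, mask: Sequence[Sequence[int]]) -> float:
    """Return 1.0 if the point lands on a nonzero mask pixel, else 0.0.

    `mask` is a row-major grid (mask[y][x]); points outside it score 0.0.
    """
    x, y = point
    height = len(mask)
    width = len(mask[0]) if height else 0
    if not (0 <= x < width and 0 <= y < height):
        return 0.0
    return 1.0 if mask[y][x] else 0.0


def combined_reward(
    format_ok: bool,
    point: Point,
    mask: Sequence[Sequence[int]],
    w_format: float = 0.2,  # hypothetical weight, not from the paper
    w_point: float = 0.8,   # hypothetical weight, not from the paper
) -> float:
    """Weighted sum of a format check and the point-in-mask accuracy term."""
    return w_format * float(format_ok) + w_point * point_in_mask_reward(point, mask)


# Example: a 3x3 mask whose lower-right 2x2 block is the target region.
mask = [
    [0, 0, 0],
    [0, 1, 1],
    [0, 1, 1],
]
inside = point_in_mask_reward((1, 1), mask)
outside = point_in_mask_reward((0, 0), mask)
```

A binary hit-or-miss term like this is easy to verify automatically, which is one reason rule-based rewards are common in reinforced fine-tuning. The actual multi-task reward design is described in the paper itself.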

Code & Models

Repositories

pickxiguapi/Embodied-R1
github

Models

IffYuan/Embodied-R1-3B-v1
model · 4.2k downloads · ♡ 1

Datasets

IffYuan/Embodied-R1-Dataset
dataset · 1.2k downloads

Videos

Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation
slideslive