Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics

Dongyoung Kim; Sumin Park; Huiwon Jang; Jinwoo Shin; Jaehyung Kim; Younggyo Seo

arXiv:2506.00070·cs.RO·January 19, 2026

TL;DR

Robot-R1 uses reinforcement learning to improve embodied reasoning for robot control. It targets two limitations of supervised fine-tuning (SFT), catastrophic forgetting and reduced generalization, and is reported to outperform SFT methods on embodied reasoning tasks and to surpass GPT-4o on low-level action control reasoning.

Contribution

The paper presents Robot-R1, a reinforcement learning framework that improves embodied reasoning for robot control and addresses the limitations of SFT. It also introduces a new benchmark for evaluation.

Findings

01 Robot-R1 outperforms SFT methods on embodied reasoning tasks.

02 Robot-R1 surpasses GPT-4o on low-level action control reasoning.

03 Models trained with Robot-R1 demonstrate enhanced generalization and accuracy.

Abstract

Large Vision-Language Models (LVLMs) have recently shown great promise in advancing robotics by combining embodied reasoning with robot control. A common approach involves training on embodied reasoning tasks related to robot control using Supervised Fine-Tuning (SFT). However, SFT datasets are often heuristically constructed and not explicitly optimized for improving robot control. Furthermore, SFT often leads to issues such as catastrophic forgetting and reduced generalization performance. To address these limitations, we introduce Robot-R1, a novel framework that leverages reinforcement learning to enhance embodied reasoning specifically for robot control. Robot-R1 learns to predict the next keypoint state required for task completion, conditioned on the current scene image and environment metadata derived from expert demonstrations. Inspired by the DeepSeek-R1 learning approach,…
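
The abstract is cut off before it describes the training signal, so the following is only a minimal sketch of how a keypoint-prediction reward could feed a DeepSeek-R1-style update. Everything here is an assumption for illustration: the binary distance-based reward, the `tolerance` value, the function names, and the use of GRPO-style group-relative advantages (the optimizer used by DeepSeek-R1) are not confirmed by the text above.

```python
import math
from statistics import mean, pstdev


def keypoint_reward(predicted, expert, tolerance=0.05):
    """Hypothetical binary reward: 1.0 if the predicted next keypoint lies
    within `tolerance` (Euclidean distance) of the expert keypoint, else 0.0."""
    return 1.0 if math.dist(predicted, expert) <= tolerance else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style normalization (an assumption here): subtract the group mean
    and divide by the group standard deviation, so each sampled response is
    scored relative to the other responses for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative values only: one expert keypoint (x, y, z) taken from a
# demonstration, and four keypoints sampled from the model for the same scene.
expert_next = (0.30, 0.10, 0.25)
samples = [
    (0.31, 0.10, 0.25),
    (0.50, 0.10, 0.25),
    (0.30, 0.12, 0.26),
    (0.10, 0.40, 0.25),
]

rewards = [keypoint_reward(s, expert_next) for s in samples]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

In this sketch, predictions close to the expert keypoint receive a positive advantage and distant ones a negative advantage, which is the quantity a policy-gradient step would then weight the sampled responses by.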

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Supervised Fine-Tuning