Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics
Dongyoung Kim, Sumin Park, Huiwon Jang, Jinwoo Shin, Jaehyung Kim, Younggyo Seo

TL;DR
Robot-R1 uses reinforcement learning to improve embodied reasoning for robot control, addressing limitations of supervised fine-tuning (SFT) such as catastrophic forgetting and reduced generalization, and outperforms GPT-4o on low-level action control reasoning.
Contribution
The paper presents Robot-R1, a reinforcement learning framework that improves embodied reasoning for robot control, overcoming SFT limitations and introducing a new benchmark for evaluation.
Findings
Robot-R1 outperforms SFT methods on embodied reasoning tasks.
Robot-R1 surpasses GPT-4o on low-level action control reasoning.
Models trained with Robot-R1 demonstrate enhanced generalization and accuracy.
Abstract
Large Vision-Language Models (LVLMs) have recently shown great promise in advancing robotics by combining embodied reasoning with robot control. A common approach involves training on embodied reasoning tasks related to robot control using Supervised Fine-Tuning (SFT). However, SFT datasets are often heuristically constructed and not explicitly optimized for improving robot control. Furthermore, SFT often leads to issues such as catastrophic forgetting and reduced generalization performance. To address these limitations, we introduce Robot-R1, a novel framework that leverages reinforcement learning to enhance embodied reasoning specifically for robot control. Robot-R1 learns to predict the next keypoint state required for task completion, conditioned on the current scene image and environment metadata derived from expert demonstrations. Inspired by the DeepSeek-R1 learning approach,…
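As an illustration of the next-keypoint prediction setup described in the abstract, the following Python sketch pairs each step of an expert demonstration with the keypoint state that follows it. The `DemoStep` and `PredictionSample` types, the field names, and the (x, y, z) keypoint encoding are assumptions made for this example and are not taken from the paper. The reinforcement learning step itself is not shown.

```python
from dataclasses import dataclass
from typing import List, Tuple

# All names here are hypothetical; the source does not specify a data format.
Keypoint = Tuple[float, float, float]  # assumed: an (x, y, z) end-effector position


@dataclass
class DemoStep:
    image_path: str     # current scene image
    metadata: str       # environment metadata, e.g. a task description
    keypoint: Keypoint  # keypoint state recorded at this step


@dataclass
class PredictionSample:
    image_path: str
    metadata: str
    current: Keypoint
    target: Keypoint    # next keypoint state the model should predict


def build_samples(demo: List[DemoStep]) -> List[PredictionSample]:
    """Pair each demonstration step with the keypoint that follows it."""
    return [
        PredictionSample(cur.image_path, cur.metadata, cur.keypoint, nxt.keypoint)
        for cur, nxt in zip(demo, demo[1:])
    ]


# Made-up demonstration with three keypoints, for illustration only.
demo = [
    DemoStep("frame_000.png", "task: pick up the cube", (0.0, 0.0, 0.5)),
    DemoStep("frame_040.png", "task: pick up the cube", (0.2, 0.1, 0.1)),
    DemoStep("frame_075.png", "task: pick up the cube", (0.2, 0.1, 0.4)),
]
samples = build_samples(demo)
```

A demonstration with n keypoints yields n - 1 samples under this pairing, since the final keypoint has no successor to predict.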
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Supervised Fine-Tuning
