Mind and Motion Aligned: A Joint Evaluation IsaacSim Benchmark for Task Planning and Low-Level Policies in Mobile Manipulation

Nikita Kachaev; Andrei Spiridonov; Andrey Gorodetsky; Kirill Muravyev; Nikita Oskolkov; Aditya Narendra; Vlad Shakhuro; Dmitry Makarov; Aleksandr I. Panov; Polina Fedotova; Alexey K. Kovalev

arXiv:2508.15663 · cs.RO · August 22, 2025

TL;DR

This paper introduces Kitchen-R, a comprehensive benchmark in simulated kitchen environments that evaluates integrated task planning and low-level control for mobile manipulators using diverse language instructions.

Contribution

It presents a unified benchmark for evaluating both high-level task planning and low-level control in embodied AI, filling a critical gap in existing robotics benchmarks.

Findings

01. Supports over 500 complex language instructions

02. Provides baseline methods including vision-language planning and diffusion-based control

03. Enables evaluation of planning, control, and integrated systems
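The three evaluation settings named in the findings (planning only, control only, and the integrated system) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from Kitchen-R: the `Episode` type, the `evaluate` function, the mode names, the exact-match plan criterion, and the toy planner and executor are all hypothetical. The only ideas borrowed from the source are that planning can be scored while assuming perfect execution, control can be scored given a known plan, and the integrated setting requires both to succeed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Plan = List[str]  # a plan is an ordered list of step names (hypothetical format)


@dataclass
class Episode:
    instruction: str      # natural-language task description
    reference_plan: Plan  # ground-truth step sequence (assumed to be available)


def evaluate(
    episodes: Sequence[Episode],
    planner: Callable[[str], Plan],
    executor: Callable[[str], bool],
    mode: str,
) -> float:
    """Return the success rate under one of three hypothetical modes.

    planning:   score the planner, assume every step executes perfectly.
    control:    score the executor on the reference plan (oracle planner).
    integrated: the predicted plan must match AND every step must execute.
    """
    if mode not in {"planning", "control", "integrated"}:
        raise ValueError(f"unknown mode: {mode}")
    if not episodes:
        return 0.0

    successes = 0
    for ep in episodes:
        # Control-only evaluation bypasses the planner entirely.
        plan = ep.reference_plan if mode == "control" else planner(ep.instruction)
        # Exact match is an assumed plan criterion, chosen for simplicity.
        plan_ok = plan == ep.reference_plan
        # Planning-only evaluation assumes perfect low-level execution.
        exec_ok = True if mode == "planning" else all(executor(s) for s in plan)
        successes += int(plan_ok and exec_ok)
    return successes / len(episodes)


# Toy data: the planner gets the second task wrong, the executor fails one skill.
EPISODES = [
    Episode("bring the cup to the table",
            ["move to sink", "pick cup", "move to table", "place cup"]),
    Episode("put the plate in the sink",
            ["move to table", "pick plate", "move to sink", "place plate"]),
]

TOY_PLANS: Dict[str, Plan] = {
    "bring the cup to the table": ["move to sink", "pick cup", "move to table", "place cup"],
    "put the plate in the sink": ["move to sink", "place plate"],  # missing steps
}


def toy_planner(instruction: str) -> Plan:
    return TOY_PLANS.get(instruction, [])


def toy_executor(step: str) -> bool:
    return step != "pick cup"  # pretend this one skill always fails


if __name__ == "__main__":
    for mode in ("planning", "control", "integrated"):
        print(mode, evaluate(EPISODES, toy_planner, toy_executor, mode))
```

In this toy setup the planner and the executor each succeed on half of the episodes in isolation, yet the integrated score is lower than either, because their failures fall on different episodes. That is the kind of effect a joint evaluation is meant to expose.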

Abstract

Benchmarks are crucial for evaluating progress in robotics and embodied AI. However, a significant gap exists between benchmarks designed for high-level language instruction following, which often assume perfect low-level execution, and those for low-level robot control, which rely on simple, one-step commands. This disconnect prevents a comprehensive evaluation of integrated systems where both task planning and physical execution are critical. To address this, we propose Kitchen-R, a novel benchmark that unifies the evaluation of task planning and low-level control within a simulated kitchen environment. Built as a digital twin using the Isaac Sim simulator and featuring more than 500 complex language instructions, Kitchen-R supports a mobile manipulator robot. We provide baseline methods for our benchmark, including a task-planning strategy based on a vision-language model and a…

Taxonomy

Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Social Robot Interaction and HRI