When Should a Robot Think? Resource-Aware Reasoning via Reinforcement Learning for Embodied Robotic Decision-Making

Jun Liu; Pu Zhao; Zhenglun Kong; Xuan Shen; Peiyan Dong; Fan Yang; Lin Cui; Hao Tang; Geng Yuan; Wei Niu; Wenbin Zhang; Xue Lin; Gaowen Liu; Yanzhi Wang; Dong Huang

arXiv:2603.16673 · cs.RO · April 3, 2026

TL;DR

This paper introduces RARRL, a hierarchical reinforcement learning framework that enables embodied robots to adaptively decide when and how much to reason, balancing computational resources and task success.

Contribution

RARRL learns a high-level orchestration policy that dynamically decides when to reason and when to act, improving efficiency and robustness in embodied robotic decision-making.

Findings

1. RARRL improves task success rates over fixed reasoning strategies.

2. RARRL reduces execution latency and increases robustness in robotic tasks.

3. Experiments on the ALFRED benchmark validate the effectiveness of adaptive reasoning.

Abstract

Embodied robotic systems increasingly rely on large language model (LLM)-based agents to support high-level reasoning, planning, and decision-making during interactions with the environment. However, invoking LLM reasoning introduces substantial computational latency and resource overhead, which can interrupt action execution and reduce system reliability. Excessive reasoning may delay actions, while insufficient reasoning often leads to incorrect decisions and task failures. This raises a fundamental question for embodied agents: when should the agent reason, and when should it act? In this work, we propose RARRL (Resource-Aware Reasoning via Reinforcement Learning), a hierarchical framework for resource-aware orchestration of embodied agents. Rather than learning low-level control policies, RARRL learns a high-level orchestration policy that operates at the agent's decision-making…
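
The trade-off the abstract describes (reason too much and actions are delayed, reason too little and tasks fail) can be illustrated with a toy example. The sketch below is not the RARRL method, whose details are not given here. It is a minimal one-step value-learning loop under invented assumptions: two uncertainty levels, two high-level choices (act directly or reason first), made-up success probabilities, and a reward equal to task success minus a weighted latency penalty. Every name and number in it is hypothetical.

```python
import random

ACT, REASON = 0, 1          # high-level choices available to the orchestrator
LOW, HIGH = 0, 1            # coarse uncertainty level of the current situation

# Hypothetical success probabilities, chosen only for illustration.
SUCCESS_PROB = {
    (LOW, ACT): 0.90, (LOW, REASON): 0.95,
    (HIGH, ACT): 0.30, (HIGH, REASON): 0.95,
}
REASON_LATENCY = 1.0        # assumed cost (arbitrary units) of one reasoning call


def reward(success: bool, action: int, latency_weight: float) -> float:
    """Task success minus a latency penalty that only reasoning incurs."""
    latency = REASON_LATENCY if action == REASON else 0.0
    return float(success) - latency_weight * latency


def train(episodes=20000, epsilon=0.3, latency_weight=0.2, seed=0):
    """Estimate the value of each (state, action) pair from one-step episodes."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in (LOW, HIGH) for a in (ACT, REASON)}
    counts = {key: 0 for key in q}
    for _ in range(episodes):
        state = rng.choice((LOW, HIGH))
        # Epsilon-greedy exploration over the two high-level choices.
        if rng.random() < epsilon:
            action = rng.choice((ACT, REASON))
        else:
            action = max((ACT, REASON), key=lambda a: q[(state, a)])
        success = rng.random() < SUCCESS_PROB[(state, action)]
        r = reward(success, action, latency_weight)
        # Sample-average update: q becomes the running mean of observed rewards.
        counts[(state, action)] += 1
        q[(state, action)] += (r - q[(state, action)]) / counts[(state, action)]
    return q


def greedy_policy(q):
    """Pick the higher-valued choice in each state."""
    return {s: max((ACT, REASON), key=lambda a: q[(s, a)]) for s in (LOW, HIGH)}


q_values = train()
policy = greedy_policy(q_values)
```

Under these assumed numbers, the learned policy acts directly when uncertainty is low and reasons first when it is high. Raising `latency_weight` makes reasoning less attractive, so calling `train(latency_weight=1.0)` should shift the policy toward acting without reasoning in both states.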
