PRISM: Planning and Reasoning with Intent in Simulated Embodied Environments

Yunn Kang Lim; Pengzhan Sun; Ziyi Bai; Xun Xu; Angela Yao; Xulei Yang; Shijie Li

arXiv:2605.11534 · cs.RO · May 13, 2026

TL;DR

PRISM is a diagnostic benchmark for embodied agents that identifies specific cognitive failures across perception, reasoning, and coordination in photorealistic environments.

Contribution

PRISM introduces a structured, three-tier benchmark with an agent-agnostic executable action API, enabling component-level failure analysis.

Findings

01. Implicit intent resolution is a major bottleneck for all models.

02. With oracle perception, explicit spatial grounding is not the main source of failure.

03. Long-horizon coordination is a significant challenge, especially for lightweight models.

Abstract

When an LLM-based embodied agent fails at a household task, the culprit could be misidentified objects, forgotten sub-goals, or poor action sequencing, yet existing benchmarks report only a single success rate, making it impossible to tell which cognitive module is responsible. We present PRISM, a diagnostic benchmark that reframes this problem: rather than asking only "did the agent succeed?", PRISM asks "which capability is most likely responsible for failure?" Built on five photorealistic multi-room apartments (4 to 8 rooms each), PRISM structures 300 human-verified tasks into three capability tiers (Basic Ability, Reasoning Ability, and Long-horizon Ability) that isolate perception-to-action grounding, implicit intent resolution, and sustained multi-step coordination, respectively. PRISM exposes an agent-agnostic executable action API that…
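
The tiered structure described in the abstract suggests reporting results per capability tier rather than as a single success rate. The following is a minimal sketch of that aggregation in Python, not PRISM's actual evaluation code. The `EpisodeResult` record, its fields, and both helper functions are hypothetical names introduced for illustration; only the three tier names come from the abstract, and the sample data is made up.

```python
from collections import defaultdict
from dataclasses import dataclass

# Tier names are taken from the abstract; everything else is hypothetical.
TIERS = ("Basic Ability", "Reasoning Ability", "Long-horizon Ability")


@dataclass(frozen=True)
class EpisodeResult:
    """One evaluated task (hypothetical record, not PRISM's format)."""
    task_id: str
    tier: str
    success: bool


def success_rate_by_tier(results):
    """Return {tier: success rate} for every tier with at least one result."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for r in results:
        if r.tier not in TIERS:
            raise ValueError(f"unknown tier: {r.tier!r}")
        totals[r.tier] += 1
        wins[r.tier] += int(r.success)
    return {t: wins[t] / totals[t] for t in TIERS if totals[t]}


def weakest_tier(rates):
    """Return the tier with the lowest success rate."""
    return min(rates, key=rates.get)


if __name__ == "__main__":
    # Toy data for illustration only; these are not results from the paper.
    toy = [
        EpisodeResult("t1", "Basic Ability", True),
        EpisodeResult("t2", "Basic Ability", True),
        EpisodeResult("t3", "Reasoning Ability", True),
        EpisodeResult("t4", "Reasoning Ability", False),
        EpisodeResult("t5", "Long-horizon Ability", False),
    ]
    rates = success_rate_by_tier(toy)
    print(rates)
    print("weakest tier:", weakest_tier(rates))
```

A per-tier breakdown of this kind shows where failures concentrate, which a single aggregate success rate cannot do.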

Code & Models

Repositories

https://sj-li.com/PROJ/PRISM
github