Static and Plugged: Make Embodied Evaluation Simple

Jiahao Xiao; Jianbo Zhang; BoWen Yan; Shengyu Guo; Tongrui Ye; Kaiwei Zhang; Zicheng Zhang; Xiaohong Liu; Zhengxue Cheng; Lei Fan; Chuyi Li; Guangtao Zhai

arXiv:2508.06553·cs.CV·August 12, 2025

TL;DR

This paper introduces StaticEmbodiedBench, a scalable, unified static benchmark for evaluating embodied intelligence across diverse scenarios, enabling efficient assessment of vision-language models with a simple interface.

Contribution

The paper presents a static benchmark for embodied intelligence evaluation that covers 42 scenarios and 8 core dimensions, and it establishes the first unified static leaderboard for Vision-Language Models (VLMs) and Vision-Language-Action models (VLAs).

Findings

1. Evaluated 19 VLMs and 11 VLAs on the benchmark.

2. Established the first unified static leaderboard for embodied intelligence.

3. Released a subset of 200 benchmark samples to facilitate research.

Abstract

Embodied intelligence is advancing rapidly, driving the need for efficient evaluation. Current benchmarks typically rely on interactive simulated environments or real-world setups, which are costly, fragmented, and hard to scale. To address this, we introduce StaticEmbodiedBench, a plug-and-play benchmark that enables unified evaluation using static scene representations. Covering 42 diverse scenarios and 8 core dimensions, it supports scalable and comprehensive assessment through a simple interface. Furthermore, we evaluate 19 Vision-Language Models (VLMs) and 11 Vision-Language-Action models (VLAs), establishing the first unified static leaderboard for embodied intelligence. We also release a subset of 200 samples from our benchmark to accelerate the development of embodied intelligence.
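
To make the idea of static evaluation concrete, the following is a minimal sketch of how a model could be scored on pre-collected samples without a simulator in the loop. It is not the benchmark's actual interface: the `StaticSample` schema, the multiple-choice answer format, and the `evaluate` function are all assumptions introduced here for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StaticSample:
    """One static benchmark item (hypothetical schema, not the released format)."""
    image_path: str      # static scene representation, e.g. a saved frame
    question: str        # prompt shown to the model
    choices: List[str]   # candidate answers (multiple choice is assumed)
    answer: str          # ground-truth choice label, e.g. "B"
    dimension: str       # evaluation dimension this item belongs to


def evaluate(model: Callable[[StaticSample], str],
             samples: List[StaticSample]) -> Dict[str, float]:
    """Return accuracy per dimension plus an 'overall' entry."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for sample in samples:
        # Normalize the prediction so " b" and "B" count as the same label.
        prediction = model(sample).strip().upper()
        total[sample.dimension] += 1
        correct[sample.dimension] += int(prediction == sample.answer.upper())
    scores = {dim: correct[dim] / total[dim] for dim in total}
    # Guard against an empty sample list when computing the overall score.
    scores["overall"] = sum(correct.values()) / max(sum(total.values()), 1)
    return scores
```

Because every sample is a fixed input with a fixed reference answer, any model that can be wrapped as a function from sample to answer label can be dropped into `evaluate`, which is the sense in which a static benchmark is plug-and-play under these assumptions.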

Code & Models

Datasets

xiaojiahao/StaticEmbodiedBench