On the Evaluation of Generative Robotic Simulations

Feng Chen; Botian Xu; Pu Hua; Peiqi Duan; Yanchao Yang; Yi Ma; Huazhe Xu

arXiv:2410.08172 · cs.RO · October 11, 2024

TL;DR

This paper introduces an evaluation framework for generative robotic simulations that covers three aspects: quality, diversity, and generalization. Its experiments indicate that the proposed metrics align with human assessments.

Contribution

The paper proposes a multi-faceted evaluation framework designed specifically for generative robotic tasks, addressing the challenge of adequately evaluating tasks that foundation models generate autonomously.

Findings

1. Metrics for quality and diversity can be optimized separately.

2. No single method excels across all evaluation metrics.

3. Current models face significant challenges in zero-shot generalization.

Abstract

Due to the difficulty of acquiring extensive real-world data, robot simulation has become crucial for parallel training and sim-to-real transfer, highlighting the importance of scalable simulated robotic tasks. Foundation models have demonstrated impressive capacities in autonomously generating feasible robotic tasks. However, this new paradigm underscores the challenge of adequately evaluating these autonomously generated tasks. To address this, we propose a comprehensive evaluation framework tailored to generative simulations. Our framework segments evaluation into three core aspects: quality, diversity, and generalization. For single-task quality, we evaluate the realism of the generated task and the completeness of the generated trajectories using large language models and vision-language models. In terms of diversity, we measure both task and data diversity through text similarity…
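To make the idea of measuring task diversity through text similarity concrete, here is a minimal sketch. It is not the authors' implementation: the abstract is truncated and does not say which text representation or aggregation is used. The bag-of-words vectors, the cosine similarity, the "one minus mean pairwise similarity" score, and the function names (`bag_of_words`, `cosine_similarity`, `task_diversity`) are all assumptions chosen only to keep the example self-contained.

```python
import math
import re
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Assumed stand-in for a text embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def task_diversity(descriptions):
    """Hypothetical diversity score: 1 - mean pairwise similarity.

    Returns 0.0 when fewer than two descriptions are given, since no
    pair exists to compare.
    """
    vectors = [bag_of_words(d) for d in descriptions]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    mean_sim = sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_sim


if __name__ == "__main__":
    tasks = [
        "open the drawer and place the cup inside",
        "open the drawer and place the bowl inside",
        "stack three blocks on the table",
    ]
    print(f"task diversity: {task_diversity(tasks):.3f}")
```

Under these assumptions, a set of near-identical task descriptions scores close to 0 and a set with no shared vocabulary scores 1. A learned sentence embedding could replace `bag_of_words` without changing the rest of the sketch.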

Taxonomy

Topics: Manufacturing Process and Optimization · Modular Robots and Swarm Intelligence

Methods: Focus