LogicEnvGen: Task-Logic Driven Generation of Diverse Simulated Environments for Embodied AI

Jianan Wang; Siyang Zhang; Bin Li; Juan Chen; Jingtao Qi; Zhuo Zhang; Chen Qian

arXiv:2601.13556 · cs.RO · January 21, 2026

TL;DR

LogicEnvGen is a new LLM-driven method that generates logically diverse simulated environments to better evaluate embodied AI agents, addressing the lack of logical variety in existing environment generation approaches.

Contribution

The paper introduces LogicEnvGen, a top-down, logic-based environment generation framework, and LogicEnvEval, a benchmark with metrics for evaluating the logical diversity of generated environments.

Findings

1. LogicEnvGen achieves 1.04-2.61x greater diversity than baselines.
2. It significantly improves fault detection in agents by 4-68%.
3. The method ensures physical plausibility through constraint solving.

Abstract

Simulated environments play an essential role in embodied AI, functionally analogous to test cases in software engineering. However, existing environment generation methods often emphasize visual realism (e.g., object diversity and layout coherence), overlooking a crucial aspect: logical diversity from the testing perspective. This limits the comprehensive evaluation of agent adaptability and planning robustness in distinct simulated environments. To bridge this gap, we propose LogicEnvGen, a novel method driven by Large Language Models (LLMs) that adopts a top-down paradigm to generate logically diverse simulated environments as test cases for agents. Given an agent task, LogicEnvGen first analyzes its execution logic to construct decision-tree-structured behavior plans and then synthesizes a set of logical trajectories. Subsequently, it adopts a heuristic algorithm to refine the…
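The step from a decision-tree-structured behavior plan to a set of logical trajectories can be illustrated with a minimal sketch. This is not the authors' implementation: the `Action` and `Condition` classes, the `enumerate_trajectories` helper, and the apple-fetching task are all hypothetical, and the sketch simply assumes that each root-to-leaf path of the plan counts as one logical trajectory. Each trajectory records the truth values an environment would need to satisfy, together with the actions the agent is expected to take.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Action:
    """A plan step the agent performs (hypothetical structure)."""
    name: str
    next: Optional["Node"] = None  # following step, or None when the plan ends


@dataclass
class Condition:
    """A branch on an environment predicate (hypothetical structure)."""
    predicate: str
    if_true: Optional["Node"]
    if_false: Optional["Node"]


Node = Union[Action, Condition]


def enumerate_trajectories(node, assumptions=(), actions=()):
    """Yield (assumptions, actions) for every root-to-leaf path of the plan."""
    if node is None:
        # End of a path: emit the branch outcomes and the actions collected.
        yield dict(assumptions), list(actions)
    elif isinstance(node, Action):
        yield from enumerate_trajectories(
            node.next, assumptions, actions + (node.name,))
    else:
        # Explore both outcomes of the predicate, true branch first.
        yield from enumerate_trajectories(
            node.if_true, assumptions + ((node.predicate, True),), actions)
        yield from enumerate_trajectories(
            node.if_false, assumptions + ((node.predicate, False),), actions)


# Illustrative task (an assumption, not from the paper): fetch an apple.
look_in_fridge = Condition(
    "apple_in_fridge",
    if_true=Action("take_apple"),
    if_false=Action("search_table"),
)
plan = Condition(
    "fridge_is_open",
    if_true=look_in_fridge,
    if_false=Action("open_fridge", next=look_in_fridge),
)

trajectories = list(enumerate_trajectories(plan))
for env_facts, steps in trajectories:
    print(env_facts, "->", steps)
```

In this toy plan the two predicates yield four trajectories, and each set of branch outcomes describes one logically distinct environment that could serve as a test case. Refining the trajectory set and checking physical plausibility, which the abstract attributes to a heuristic algorithm and constraint solving, are outside the scope of this sketch.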

Taxonomy

Topics: Artificial Intelligence in Games · Multimodal Machine Learning Applications · Software Testing and Debugging Techniques