LogicEnvGen: Task-Logic Driven Generation of Diverse Simulated Environments for Embodied AI
Jianan Wang, Siyang Zhang, Bin Li, Juan Chen, Jingtao Qi, Zhuo Zhang, Chen Qian

TL;DR
LogicEnvGen is a new LLM-driven method that generates logically diverse simulated environments to better evaluate embodied AI agents, addressing the lack of logical variety in existing environment generation approaches.
Contribution
The paper introduces LogicEnvGen, a top-down, logic-based environment generation framework, and LogicEnvEval, a benchmark with metrics for evaluating the logical diversity of generated environments.
Findings
LogicEnvGen achieves 1.04-2.61x greater logical diversity than baselines.
It significantly improves the detection of agent faults, by 4-68%.
The method ensures physical plausibility through constraint solving.
Abstract
Simulated environments play an essential role in embodied AI, functionally analogous to test cases in software engineering. However, existing environment generation methods often emphasize visual realism (e.g., object diversity and layout coherence), overlooking a crucial aspect: logical diversity from the testing perspective. This limits the comprehensive evaluation of agent adaptability and planning robustness in distinct simulated environments. To bridge this gap, we propose LogicEnvGen, a novel method driven by Large Language Models (LLMs) that adopts a top-down paradigm to generate logically diverse simulated environments as test cases for agents. Given an agent task, LogicEnvGen first analyzes its execution logic to construct decision-tree-structured behavior plans and then synthesizes a set of logical trajectories. Subsequently, it adopts a heuristic algorithm to refine the…
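To make the idea of a decision-tree-structured behavior plan and its logical trajectories concrete, here is a minimal Python sketch. It is an illustration only, not the paper's implementation: the `Leaf` and `Branch` classes, the `enumerate_trajectories` function, and the cup-fetching plan are all hypothetical names invented for this example. It assumes that a logical trajectory can be read as one root-to-leaf path through the tree, that is, one combination of condition outcomes together with the action it leads to.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    """Terminal node: the action the agent takes (hypothetical structure)."""
    action: str


@dataclass
class Branch:
    """Decision node: a condition on the environment with two outcomes."""
    condition: str
    if_true: Union["Leaf", "Branch"]
    if_false: Union["Leaf", "Branch"]


# A trajectory is the sequence of (condition, outcome) pairs plus the final action.
Trajectory = Tuple[Tuple[Tuple[str, bool], ...], str]


def enumerate_trajectories(node, prefix=()) -> List[Trajectory]:
    """Return every root-to-leaf path of the plan, true branch first."""
    if isinstance(node, Leaf):
        return [(prefix, node.action)]
    true_paths = enumerate_trajectories(
        node.if_true, prefix + ((node.condition, True),)
    )
    false_paths = enumerate_trajectories(
        node.if_false, prefix + ((node.condition, False),)
    )
    return true_paths + false_paths


# Hypothetical plan for a "fetch the cup" task.
plan = Branch(
    "cup_on_table",
    Leaf("pick_up_cup"),
    Branch("cabinet_open", Leaf("take_cup_from_cabinet"), Leaf("open_cabinet")),
)

trajectories = enumerate_trajectories(plan)
for conditions, action in trajectories:
    print(conditions, "->", action)
```

Under this reading, each enumerated trajectory corresponds to one logically distinct situation an environment could instantiate, so a test suite covering all three paths exercises every branch of the plan.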
Taxonomy
Topics: Artificial Intelligence in Games · Multimodal Machine Learning Applications · Software Testing and Debugging Techniques
