GenEscape: Hierarchical Multi-Agent Generation of Escape Room Puzzles
Mengyi Shan, Brian Curless, Ira Kemelmacher-Shlizerman, Steve Seitz

TL;DR
This paper introduces GenEscape, a hierarchical multi-agent framework that generates visually appealing and functionally coherent escape room puzzle images by decomposing the task into structured stages and enabling agent collaboration.
Contribution
The paper presents a novel hierarchical multi-agent approach that improves the quality and solvability of generated escape room puzzles compared to existing models.
Findings
Agent collaboration enhances puzzle solvability.
The framework reduces shortcuts and improves affordance clarity.
Generated puzzles maintain high visual quality.
Abstract
We challenge text-to-image models with generating escape room puzzle images that are visually appealing, logically solid, and intellectually stimulating. While base image models struggle with spatial relationships and affordance reasoning, we propose a hierarchical multi-agent framework that decomposes this task into structured stages: functional design, symbolic scene graph reasoning, layout synthesis, and local image editing. Specialized agents collaborate through iterative feedback to ensure the scene is visually coherent and functionally solvable. Experiments show that agent collaboration improves output quality in terms of solvability, shortcut avoidance, and affordance clarity, while maintaining visual quality.
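To make "functionally solvable" concrete, the following is a minimal sketch of the kind of check a symbolic scene graph allows. It is not the authors' implementation: the `Action` type, the `solve` function, and the example items (`key`, `code`, `exit`) are all hypothetical names chosen for illustration. It assumes a puzzle can be reduced to a set of actions, each requiring some items and yielding others.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical scene-graph edge: an interaction the player can perform."""
    name: str
    requires: frozenset  # items the player must hold first
    yields: frozenset    # items the player gains afterwards


def solve(actions, start, goal):
    """Return an ordered list of action names that reaches `goal`, or None.

    Repeatedly applies every action whose requirements are met until the
    goal is obtained or no further action becomes available.
    """
    have = set(start)
    plan = []
    remaining = list(actions)
    progress = True
    while goal not in have and progress:
        progress = False
        for action in list(remaining):
            if action.requires <= have:
                have |= action.yields
                plan.append(action.name)
                remaining.remove(action)
                progress = True
                if goal in have:
                    break
    return plan if goal in have else None


# Illustrative room (assumed, not from the paper): rug -> key -> code -> exit.
room = [
    Action("open_door", frozenset({"code"}), frozenset({"exit"})),
    Action("unlock_drawer", frozenset({"key"}), frozenset({"code"})),
    Action("lift_rug", frozenset(), frozenset({"key"})),
]
plan = solve(room, start=set(), goal="exit")
print(plan)
```

In a feedback loop of the kind the abstract describes, a `None` result would be the signal to send the design back for revision before any layout or image editing is attempted. A shortcut check could reuse the same function by asking whether the goal is still reachable after removing an action the designer intended to be mandatory.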
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games
Methods: Balanced Selection
