SR-Platform: An Agentic Pipeline for Natural Language-Driven Robot Simulation Environment Synthesis
Ben Wei Lim, Minh Duc Le, Thang Truong, Thanh Nguyen Canh

TL;DR
SR-Platform is an agentic system that converts natural language descriptions into executable MuJoCo robot environments, significantly reducing manual effort and enabling scene creation in under a minute.
Contribution
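The paper introduces a four-stage pipeline (orchestrator, asset forge, layout architect, and bridge layer) that combines LLM planning with cached-asset retrieval and LLM-to-CadQuery geometry generation to automate robot scene synthesis from plain English descriptions.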
Findings
Median scene generation latency is approximately 50 seconds.
Cache-accelerated scenes complete in 30-40 seconds.
The asset forge has an 11.3% retry rate, with automatic recovery.
Abstract
Generating robot simulation environments remains a major bottleneck in simulation-based robot learning. Constructing a training-ready MuJoCo scene typically requires expertise in 3D asset modeling, MJCF specification, spatial layout, collision avoidance, and robot-model integration. We present SR-Platform, a production-deployed agentic system that converts free-form natural language descriptions into executable, physically valid MuJoCo environments. SR-Platform decomposes scene synthesis into four stages: an LLM-based orchestrator that converts user intent into a structured scene plan; an asset forge that retrieves cached assets or generates new 3D geometry through LLM-to-CadQuery synthesis; a layout architect that assigns object poses and verifies industrial constraints; and a bridge layer that assembles the final MJCF scene and merges the selected robot model. The system is deployed…
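The four stages named in the abstract can be pictured with a small toy sketch. This is not the authors' implementation: every name below (`ObjectSpec`, `orchestrate`, `forge`, `layout`, `bridge`, `synthesize`) is hypothetical, the LLM orchestrator is replaced by a keyword lookup, the asset forge by a plain dictionary cache, and the layout architect by a row placement that keeps boxes from overlapping. Robot-model merging and CadQuery generation are omitted. It only shows how a structured plan might flow through the stages into a minimal MJCF string.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class ObjectSpec:
    name: str
    size: tuple  # box half-extents (x, y, z) in metres


@dataclass
class PlacedObject:
    spec: ObjectSpec
    pos: tuple  # body position (x, y, z) in metres


def orchestrate(prompt):
    """Stand-in for the LLM orchestrator: a keyword lookup, not an LLM call."""
    catalog = {"table": (0.6, 0.4, 0.02), "box": (0.05, 0.05, 0.05)}
    text = prompt.lower()
    return [ObjectSpec(n, s) for n, s in catalog.items() if n in text]


def forge(spec, cache):
    """Stand-in for the asset forge: reuse a cached asset or store a new one."""
    if spec.name not in cache:
        # A real system would generate geometry here; we just keep the spec.
        cache[spec.name] = spec
    return cache[spec.name]


def layout(specs, gap=0.1):
    """Stand-in for the layout architect: place boxes in a row along x,
    separated by `gap`, so their extents cannot overlap."""
    placed, x = [], 0.0
    for s in specs:
        x += s.size[0]                       # move to this box's centre
        placed.append(PlacedObject(s, (x, 0.0, s.size[2])))
        x += s.size[0] + gap                 # move past its far edge
    return placed


def bridge(placed):
    """Stand-in for the bridge layer: emit a minimal MJCF document."""
    root = ET.Element("mujoco", model="scene")
    world = ET.SubElement(root, "worldbody")
    for p in placed:
        pos = " ".join(f"{v:g}" for v in p.pos)
        size = " ".join(f"{v:g}" for v in p.spec.size)
        body = ET.SubElement(world, "body", name=p.spec.name, pos=pos)
        ET.SubElement(body, "geom", type="box", size=size)
    return ET.tostring(root, encoding="unicode")


def synthesize(prompt, cache):
    """Run the four toy stages in order and return MJCF text."""
    specs = [forge(s, cache) for s in orchestrate(prompt)]
    return bridge(layout(specs))


if __name__ == "__main__":
    print(synthesize("A table with a box next to it", {}))
```

Passing the same `cache` dictionary across calls mimics, very loosely, why cache-accelerated scenes could finish faster: objects already stored skip the generation step.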
