RoboTidy : A 3D Gaussian Splatting Household Tidying Benchmark for Embodied Navigation and Action
Xiaoquan Sun, Ruijian Zhang, Kang Pang, Bingchen Miao, Yuxiang Tan, Zhen Yang, Ming Li, Jiayu Chen

TL;DR
RoboTidy introduces a comprehensive 3D household tidying benchmark supporting language-guided navigation and action, with photorealistic scenes, demonstrations, and real-world deployment to advance embodied AI capabilities.
Contribution
It presents RoboTidy, a unified benchmark with realistic 3D scenes, diverse trajectories, and real-world testing for evaluating language-guided household tidying robots.
Findings
Provides 500 photorealistic 3D scenes with objects and containers.
Includes 6.4k manipulation and 1.5k navigation trajectories for training.
Establishes real-world object tidying benchmark.
Abstract
Household tidying is an important application area, yet current benchmarks neither model user preferences nor support mobility, and they generalize poorly, making it hard to comprehensively assess integrated language-to-action capabilities. To address this, we propose RoboTidy, a unified benchmark for language-guided household tidying that supports Vision-Language-Action (VLA) and Vision-Language-Navigation (VLN) training and evaluation. RoboTidy provides 500 photorealistic 3D Gaussian Splatting (3DGS) household scenes (covering 500 objects and containers) with collisions, formulates tidying as an "Action (Object, Container)" list, and supplies 6.4k high-quality manipulation demonstration trajectories and 1.5k naviagtion trajectories to support both few-shot and large-scale training. We also deploy RoboTidy in the real world for object tidying, establishing an end-to-end benchmark for…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultimodal Machine Learning Applications · Social Robot Interaction and HRI · Robot Manipulation and Learning
