Splat-MOVER: Multi-Stage, Open-Vocabulary Robotic Manipulation via Editable Gaussian Splatting
Ola Shorinwa, Johnathan Tucker, Aliyah Smith, Aiden Swann, Timothy Chen, Roya Firoozi, Monroe Kennedy III, Mac Schwager

TL;DR
Splat-MOVER introduces a modular robotic manipulation system that uses editable Gaussian Splatting scene representations for real-time, open-vocabulary, multi-stage tasks, enabling scene understanding, editing, and grasp planning in dynamic environments.
Contribution
It presents a novel multi-stage manipulation framework leveraging Gaussian Splatting for scene understanding, editing, and grasp generation, with real-time operation and demonstrated hardware performance.
Findings
Outperforms recent baselines in open-vocabulary manipulation tasks
Enables real-time scene editing and grasp planning during manipulation
Demonstrated on Kinova robot with multiple multi-stage tasks
Abstract
We present Splat-MOVER, a modular robotics stack for open-vocabulary robotic manipulation, which leverages the editability of Gaussian Splatting (GSplat) scene representations to enable multi-stage manipulation tasks. Splat-MOVER consists of: (i) ASK-Splat, a GSplat representation that distills semantic and grasp affordance features into the 3D scene. ASK-Splat enables geometric, semantic, and affordance understanding of 3D scenes, which is critical in many robotics tasks; (ii) SEE-Splat, a real-time scene-editing module using 3D semantic masking and infilling to visualize the motions of objects that result from robot interactions in the real world. SEE-Splat creates a "digital twin" of the evolving environment throughout the manipulation task; and (iii) Grasp-Splat, a grasp generation module that uses ASK-Splat and SEE-Splat to propose affordance-aligned candidate grasps for open-world…
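As a rough illustration of the "3D semantic masking" and scene-editing idea described in the abstract, the sketch below selects Gaussians whose distilled feature vector is similar to a query embedding and then moves the selected Gaussian centers with a rigid transform. This is not the authors' implementation: the function names `semantic_mask` and `move_masked`, the cosine-similarity threshold, and the toy 2-D features are all assumptions made for illustration. A full edit would also have to rotate each Gaussian's orientation and infill the vacated region, both of which are omitted here.

```python
import numpy as np


def semantic_mask(features, query, threshold=0.5):
    """Return a boolean mask over Gaussians (hypothetical helper).

    A Gaussian is selected when the cosine similarity between its
    feature vector and the query embedding is at least `threshold`.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    return f @ q >= threshold


def move_masked(means, mask, rotation, translation):
    """Apply a rigid transform to the selected Gaussian centers only.

    Returns an edited copy; the input array is left unchanged.
    Covariance/orientation updates and infilling are omitted.
    """
    edited = means.copy()
    edited[mask] = means[mask] @ rotation.T + translation
    return edited


# Toy scene: three Gaussian centers with 2-D stand-in semantic features.
means = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
features = np.array([[1.0, 0.0],
                     [0.9, 0.1],
                     [0.0, 1.0]])
query = np.array([1.0, 0.0])  # stand-in for a text-query embedding

mask = semantic_mask(features, query, threshold=0.8)
# Lift the selected object by 0.5 along z, with no rotation.
edited = move_masked(means, mask, np.eye(3), np.array([0.0, 0.0, 0.5]))
```

In this toy scene the first two Gaussians match the query and are lifted, while the third stays in place.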
Taxonomy
Topics: Robot Manipulation and Learning · Natural Language Processing Techniques · Multimodal Machine Learning Applications
