TL;DR
SpaceMind is a modular, self-evolving vision-language framework enabling autonomous on-orbit servicing with high robustness and zero-code transfer from simulation to physical robots.
Contribution
It introduces a self-evolving, modular VLM agent framework whose MCP-Redis interface layer runs the same codebase in simulation and on physical hardware, and whose skills can be extended from operational experience without model fine-tuning.
Findings
Achieved 90-100% navigation success under nominal conditions.
Demonstrated robustness in degraded conditions with unique success in search-and-approach tasks.
Skill Self-Evolution raised success rates in multiple scenarios from initial failure to high success.
Abstract
Autonomous on-orbit servicing demands embodied agents that perceive through visual sensors, reason about 3D spatial situations, and execute multi-phase tasks over extended horizons. We present SpaceMind, a modular and self-evolving vision-language model (VLM) agent framework that decomposes knowledge, tools, and reasoning into three independently extensible dimensions: skill modules with dynamic routing, Model Context Protocol (MCP) tools with configurable profiles, and injectable reasoning-mode skills. An MCP-Redis interface layer enables the same codebase to operate across simulation and physical hardware without modification, and a Skill Self-Evolution mechanism distills operational experience into persistent skill files without model fine-tuning. We validate SpaceMind through 192 closed-loop runs across five satellites, three task types, and two environments, a UE5 simulation and a…
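The abstract's idea of routing a task to skill modules and distilling experience into persistent skill files, with no model fine-tuning, might be sketched roughly as below. This is a minimal illustration and not SpaceMind's implementation: `SkillStore`, `route_skills`, `build_prompt`, the `ROUTES` table, the JSON file layout, and the example lesson text are all hypothetical, and the MCP tools and the MCP-Redis interface layer are not modeled.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List

# Hypothetical routing table: skill name -> trigger keywords (not from the paper).
ROUTES: Dict[str, List[str]] = {
    "approach": ["approach", "rendezvous"],
    "search": ["search", "find"],
    "inspection": ["inspect"],
}


class SkillStore:
    """Assumed layout: one JSON file per skill, holding a list of text lessons."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, skill: str) -> Path:
        return self.root / f"{skill}.json"

    def load(self, skill: str) -> List[str]:
        path = self._path(skill)
        if not path.exists():
            return []
        return json.loads(path.read_text(encoding="utf-8"))

    def add_lesson(self, skill: str, lesson: str) -> None:
        # "Evolution" here only edits a file on disk; no model weights change.
        lessons = self.load(skill)
        if lesson not in lessons:
            lessons.append(lesson)
            self._path(skill).write_text(json.dumps(lessons, indent=2), encoding="utf-8")


def route_skills(task: str, routes: Dict[str, List[str]]) -> List[str]:
    """Toy dynamic routing: pick skills whose keywords occur in the task text."""
    text = task.lower()
    return [name for name, words in routes.items() if any(w in text for w in words)]


def build_prompt(task: str, store: SkillStore, routes: Dict[str, List[str]]) -> str:
    """Inject the lessons of every routed skill into the prompt text."""
    lines = [f"Task: {task}"]
    for skill in route_skills(task, routes):
        lines.append(f"[skill:{skill}]")
        lines.extend(f"- {lesson}" for lesson in store.load(skill))
    return "\n".join(lines)


def demo():
    """Build a prompt before and after one (made-up) lesson is recorded."""
    with tempfile.TemporaryDirectory() as tmp:
        store = SkillStore(Path(tmp) / "skills")
        task = "Search for the target satellite and approach it"
        before = build_prompt(task, store, ROUTES)
        store.add_lesson("approach", "Slow down when the target fills most of the frame.")
        after = build_prompt(task, store, ROUTES)
        return before, after
```

Calling `demo()` returns the prompt before and after the lesson is stored. The second prompt carries the lesson under the `approach` skill, which shows how later runs could benefit from earlier experience while the underlying model stays unchanged.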
