SpaceMind: A Modular and Self-Evolving Embodied Vision-Language Agent Framework for Autonomous On-orbit Servicing

Aodi Wu; Haodong Han; Xubo Luo; Ruisuo Wang; Shan He; Xue Wan

arXiv:2604.14399·cs.RO·April 17, 2026

TL;DR

SpaceMind is a modular, self-evolving vision-language framework enabling autonomous on-orbit servicing with high robustness and zero-code transfer from simulation to physical robots.

Contribution

SpaceMind introduces a modular, self-evolving VLM agent framework whose MCP-Redis interface layer lets the same codebase run in simulation and on physical hardware without modification, with the aim of improving robustness and adaptability.
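To make the "same codebase, two environments" idea concrete, here is a minimal sketch. It is not SpaceMind's actual interface: the `Backend` protocol, the two stand-in backends, the command fields, and the `approach` loop are all hypothetical, and plain Python objects stand in for the MCP-Redis layer. The point is only that the agent-side loop never changes when the backend is swapped.

```python
from typing import Protocol


class Backend(Protocol):
    """Hypothetical transport; an assumption, not SpaceMind's real API."""

    def send(self, command: dict) -> dict: ...


class SimBackend:
    """Stand-in for a simulator: applies the commanded motion exactly."""

    def __init__(self) -> None:
        self.range_m = 10.0

    def send(self, command: dict) -> dict:
        self.range_m -= command["forward_m"]
        return {"range_m": self.range_m}


class HardwareBackend:
    """Stand-in for a physical robot: motion is clipped by an actuator limit."""

    def __init__(self, max_step_m: float = 0.5) -> None:
        self.range_m = 10.0
        self.max_step_m = max_step_m

    def send(self, command: dict) -> dict:
        step = min(command["forward_m"], self.max_step_m)
        self.range_m -= step
        return {"range_m": self.range_m}


def approach(backend: Backend, stop_range_m: float = 2.0, max_steps: int = 50) -> int:
    """Agent-side loop, identical for every backend. Returns the steps taken."""
    state = backend.send({"forward_m": 0.0})  # zero-motion command to read state
    steps = 0
    while state["range_m"] > stop_range_m and steps < max_steps:
        gap = state["range_m"] - stop_range_m
        state = backend.send({"forward_m": min(1.0, gap)})
        steps += 1
    return steps
```

Calling `approach(SimBackend())` and `approach(HardwareBackend())` runs the same function against both stand-ins; only the object passed in differs, which is the property the unified interface is meant to provide.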

Findings

01 Achieved 90-100% navigation success under nominal conditions.

02 Demonstrated robustness in degraded conditions with unique success in search-and-approach tasks.

03 Self-evolution improved success rates from failure to high success in multiple scenarios.

Abstract

Autonomous on-orbit servicing demands embodied agents that perceive through visual sensors, reason about 3D spatial situations, and execute multi-phase tasks over extended horizons. We present SpaceMind, a modular and self-evolving vision-language model (VLM) agent framework that decomposes knowledge, tools, and reasoning into three independently extensible dimensions: skill modules with dynamic routing, Model Context Protocol (MCP) tools with configurable profiles, and injectable reasoning-mode skills. An MCP-Redis interface layer enables the same codebase to operate across simulation and physical hardware without modification, and a Skill Self-Evolution mechanism distills operational experience into persistent skill files without model fine-tuning. We validate SpaceMind through 192 closed-loop runs across five satellites, three task types, and two environments, a UE5 simulation and a…
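The Skill Self-Evolution idea in the abstract (experience distilled into persistent skill files, with no model fine-tuning) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the file format, the run-log fields, and the function names `load_skill`, `evolve_skill`, and `build_prompt` do not come from the paper. The model weights are never touched; only a text file that is injected into later prompts changes.

```python
import tempfile
from pathlib import Path


def load_skill(path: Path) -> list[str]:
    """Return the lessons stored in a skill file (empty if it does not exist)."""
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


def evolve_skill(path: Path, run_log: list[dict]) -> list[str]:
    """Append one lesson per previously unseen failure cause in run_log."""
    lessons = load_skill(path)
    for event in run_log:
        if event["outcome"] == "failure":
            lesson = f"Avoid: {event['cause']}"
            if lesson not in lessons:  # keep the skill file free of duplicates
                lessons.append(lesson)
    text = "".join(f"- {lesson}\n" for lesson in lessons)
    path.write_text(text, encoding="utf-8")
    return lessons


def build_prompt(task: str, path: Path) -> str:
    """Inject the persisted lessons into the next run's prompt."""
    lessons = "\n".join(f"- {lesson}" for lesson in load_skill(path))
    return f"Task: {task}\nLessons from earlier runs:\n{lessons}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skill_file = Path(tmp) / "approach_skill.md"  # hypothetical file name
        log = [{"outcome": "failure", "cause": "approaching while target is out of view"}]
        evolve_skill(skill_file, log)
        print(build_prompt("approach the target satellite", skill_file))
```

Because the lessons live in a file rather than in model parameters, they persist across runs and can be inspected or edited by hand, which is the practical appeal of this kind of mechanism.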

Code & Models

Repositories

wuaodi/SpaceMind
github
