Multi-Agent Reinforcement Learning for Autonomous Multi-Satellite Earth Observation: A Realistic Case Study

Mohamad A. Hady; Siyi Hu; Mahardhika Pratama; Jimmy Cao; Ryszard Kowalczyk

arXiv:2506.15207 · cs.AI · November 6, 2025

Open Access

TL;DR

This paper explores the use of Multi-Agent Reinforcement Learning to enable autonomous coordination of satellite constellations for Earth Observation, demonstrating effective resource management and decision-making in a realistic simulation environment.

Contribution

It introduces a MARL framework for multi-satellite EO mission planning, addressing decentralised coordination, partial observability, and resource constraints with empirical evaluation.

Findings

01. MARL algorithms effectively coordinate multi-satellite tasks

02. Training stability varies across algorithms like PPO and MAPPO

03. MARL approaches improve resource management in EO missions

Abstract

The exponential growth of Low Earth Orbit (LEO) satellites has revolutionised Earth Observation (EO) missions, addressing challenges in climate monitoring, disaster management, and more. However, autonomous coordination in multi-satellite systems remains a fundamental challenge. Traditional optimisation approaches struggle to handle the real-time decision-making demands of dynamic EO missions, necessitating the use of Reinforcement Learning (RL) and Multi-Agent Reinforcement Learning (MARL). In this paper, we investigate RL-based autonomous EO mission planning by modelling single-satellite operations and extending to multi-satellite constellations using MARL frameworks. We address key challenges, including energy and data storage limitations, uncertainties in satellite observations, and the complexities of decentralised coordination under partial observability. By leveraging a…
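The energy limits, storage limits, and partial observability named in the abstract can be illustrated with a toy sketch. Everything here is a hypothetical illustration and not the paper's simulation environment: the class `ToySatellite`, the action names, the integer resource units, and the costs are all assumptions chosen only to show how such constraints can gate an agent's reward.

```python
from dataclasses import dataclass

# Hypothetical resource caps in arbitrary integer units (assumptions).
ENERGY_MAX = 10
STORAGE_MAX = 4


@dataclass
class ToySatellite:
    energy: int = ENERGY_MAX   # remaining battery units
    storage: int = 0           # images currently held onboard

    def observe_locally(self, visible_targets):
        # Partial observability: the agent sees only its own resources and
        # the targets inside its own field of view, not the other satellites.
        return (self.energy, self.storage, tuple(sorted(visible_targets)))

    def step(self, action, visible_targets):
        """Apply one action and return the reward (1 per captured image)."""
        reward = 0
        if action == "image":
            # Imaging needs a visible target, enough energy, and free storage.
            feasible = (
                bool(visible_targets)
                and self.energy >= 3
                and self.storage < STORAGE_MAX
            )
            if feasible:
                self.energy -= 3
                self.storage += 1
                reward = 1
        elif action == "downlink":
            # Downlinking frees storage at a small energy cost.
            if self.storage > 0 and self.energy >= 1:
                self.energy -= 1
                self.storage = 0
        elif action == "charge":
            self.energy = min(ENERGY_MAX, self.energy + 4)
        else:
            raise ValueError(f"unknown action: {action}")
        return reward
```

In this sketch an infeasible action simply yields zero reward, so a policy has to learn when to charge or downlink instead of imaging. A multi-agent version would run one such agent per satellite, each acting on its own local observation.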

Taxonomy

Topics: Distributed and Parallel Computing Systems

Methods: Proximal Policy Optimization
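For readers unfamiliar with the method tag above, the following is a minimal sketch of the standard per-sample PPO clipped surrogate term. The function name `ppo_clipped_term` and the default `clip_eps=0.2` are illustrative assumptions; this page does not state the hyperparameters or the exact variant used in the paper.

```python
def ppo_clipped_term(ratio, advantage, clip_eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1-eps, 1+eps) * A).

    ratio     -- new-policy probability divided by old-policy probability
    advantage -- advantage estimate for the sampled action
    clip_eps  -- clipping range (0.2 is a common default, assumed here)
    """
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum removes the incentive to push the ratio
    # outside the clipping range in the direction that raises the objective.
    return min(ratio * advantage, clipped_ratio * advantage)
```

Training maximises the average of this term over a batch. With a positive advantage the gain is capped once the ratio exceeds `1 + clip_eps`; with a negative advantage the unclipped, more pessimistic value is kept.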