Hybrid Belief Reinforcement Learning for Efficient Coordinated Spatial Exploration
Danish Rizvi, David Boyle

TL;DR
This paper introduces a hybrid belief-reinforcement learning framework for multi-agent spatial exploration, combining model-based spatial belief estimation with deep RL for efficient, cooperative coverage and faster convergence.
Contribution
It proposes a hybrid approach that integrates spatial belief modeling with RL and uses transfer learning to make cooperative exploration more efficient.
Findings
Achieves 10.8% higher cumulative reward than baselines.
Converges 38% faster than existing methods.
Dual-channel transfer improves exploration efficiency.
Abstract
Coordinating multiple autonomous agents to explore and serve spatially heterogeneous demand requires jointly learning unknown spatial patterns and planning trajectories that maximize task performance. Pure model-based approaches provide structured uncertainty estimates but lack adaptive policy learning, while deep reinforcement learning often suffers from poor sample efficiency when spatial priors are absent. This paper presents a hybrid belief-reinforcement learning (HBRL) framework to address this gap. In the first phase, agents construct spatial beliefs using a Log-Gaussian Cox Process (LGCP) and execute information-driven trajectories guided by a Pathwise Mutual Information (PathMI) planner with multi-step lookahead. In the second phase, trajectory control is transferred to a Soft Actor-Critic (SAC) agent, warm-started through dual-channel knowledge transfer: belief state…
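A rough sketch of the first phase may help make the idea concrete. The abstract does not give the PathMI objective or the LGCP inference details, so everything below is an assumption for illustration: the names `rbf_kernel`, `expected_intensity`, `path_information`, and `best_path` are hypothetical, the kernel and its hyperparameters are arbitrary, and the Poisson observation model of an LGCP is replaced by a Gaussian-noise surrogate so that the mutual information has a closed form. The only parts taken from the source are the general shape: a log-Gaussian intensity belief and a planner that scores whole multi-step paths by information gain.

```python
import itertools
import numpy as np

def rbf_kernel(a, b, length_scale=2.0, variance=1.0):
    """Squared-exponential covariance between two sets of 2-D points (assumed kernel)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def expected_intensity(mu, var):
    """Mean of a log-Gaussian intensity: E[exp(f)] for f ~ N(mu, var)."""
    return np.exp(mu + 0.5 * var)

def path_information(path, noise_var=0.1):
    """Mutual information between the latent field at the path cells and
    noisy Gaussian readings taken there: 0.5 * logdet(I + K / noise_var).
    This Gaussian surrogate is an assumption, not the paper's PathMI."""
    K = rbf_kernel(path, path)
    _, logdet = np.linalg.slogdet(np.eye(len(path)) + K / noise_var)
    return 0.5 * logdet

def best_path(start, horizon=3):
    """Exhaustively score every `horizon`-step path of unit grid moves
    and return the most informative one with its score."""
    moves = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]], dtype=float)
    best, best_score = None, -np.inf
    for idx in itertools.product(range(len(moves)), repeat=horizon):
        path = start + np.cumsum(moves[list(idx)], axis=0)
        score = path_information(path)
        if score > best_score:
            best, best_score = path, score
    return best, best_score

start = np.array([0.0, 0.0])
path, score = best_path(start, horizon=3)
```

Scoring the path jointly, rather than summing per-cell gains, is what penalizes revisiting or clustering correlated cells: two readings at the same location share information, so the log-determinant grows less than it would for two well-separated readings. Exhaustive enumeration is only practical for short horizons; a real planner would need pruning or sampling.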
Taxonomy
Topics: UAV Applications and Optimization · Age of Information Optimization · Reinforcement Learning in Robotics
