A High-Throughput Compute-Efficient POMDP Hide-And-Seek-Engine (HASE) for Multi-Agent Operations
Timothy Flavin, Sandip Sen

TL;DR
This paper introduces HASE, a C++ engine for Dec-POMDPs that accelerates multi-agent reinforcement learning through data-oriented design, 64-byte cache-line alignment, and a zero-copy pinned-memory bridge to PyTorch.
Contribution
The paper presents a compute-efficient Dec-POMDP engine built natively in C++ that sustains up to 33 million environment steps per second and trains multi-agent policies in minutes.
Findings
Achieves up to 33 million steps per second in single-agent scenarios.
Provides a 3,500× speedup over baseline implementations.
Successfully trains multi-agent policies in minutes.
Abstract
Reinforcement Learning (RL) algorithms exhibit high sample complexity, particularly when applied to Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs). In response, projects such as SampleFactory, EnvPool, Brax, and IsaacLab migrate parallel execution of classic environments such as MuJoCo and Atari into C++ thread pools or onto the GPU to decrease the computational cost of environment steps. We are interested in optimizing the decision level of human-AI joint operations, so we introduce a compute-efficient Dec-POMDP engine natively architected in C++ called Hide-And-Seek-Engine (HASE). By employing Data-Oriented Design (DOD) principles, explicit 64-byte cache-line alignment to remove false sharing, and a zero-copy PyTorch memory bridge using pinned memory and Direct Memory Access (DMA), our engine sustains throughput of up to 33,000,000 steps per second (SPS) in a…
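The two layout ideas named in the abstract, data-oriented (structure-of-arrays) state and 64-byte cache-line alignment against false sharing, can be illustrated with a minimal C++17 sketch. This is not the HASE source: WorkerStats, EnvBatch, step_range, and run_partitioned are hypothetical names invented for illustration, and the "environment" is just a position that advances by a fixed amount. The worker loop runs sequentially here to keep the sketch self-contained; in a threaded engine each slice would presumably be handed to its own thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 64-byte cache lines, matching the figure quoted in the abstract.
constexpr std::size_t kCacheLine = 64;

// Hypothetical per-worker counters. alignas(64) pads every instance to a full
// cache line, so counters owned by different workers never share a line.
struct alignas(kCacheLine) WorkerStats {
    std::uint64_t steps = 0;
    std::uint64_t episodes = 0;
};
static_assert(sizeof(WorkerStats) == kCacheLine, "one WorkerStats per cache line");
static_assert(alignof(WorkerStats) == kCacheLine, "WorkerStats starts on a line boundary");

// Hypothetical structure-of-arrays batch: each field of every environment is
// stored contiguously, instead of one object per environment.
struct EnvBatch {
    std::vector<float> agent_x;
    std::vector<float> agent_y;
    std::vector<std::uint8_t> done;

    explicit EnvBatch(std::size_t n) : agent_x(n, 0.0f), agent_y(n, 0.0f), done(n, 0) {}
    std::size_t size() const { return done.size(); }
};

// Advances environments in [begin, end) by one toy step and counts the steps
// in the caller's own WorkerStats slot.
void step_range(EnvBatch& batch, std::size_t begin, std::size_t end,
                float dx, WorkerStats& stats) {
    for (std::size_t i = begin; i < end; ++i) {
        batch.agent_x[i] += dx;
        ++stats.steps;
    }
}

// Splits the batch into contiguous slices, one per worker, and steps each
// slice `iters` times. Sequential here; returns the total number of steps.
std::uint64_t run_partitioned(EnvBatch& batch, unsigned num_workers, unsigned iters) {
    assert(num_workers > 0);
    std::vector<WorkerStats> stats(num_workers);  // C++17 honors the alignment
    const std::size_t n = batch.size();
    for (unsigned w = 0; w < num_workers; ++w) {
        const std::size_t begin = n * w / num_workers;
        const std::size_t end = n * (w + 1) / num_workers;
        for (unsigned k = 0; k < iters; ++k) {
            step_range(batch, begin, end, 1.0f, stats[w]);
        }
    }
    std::uint64_t total = 0;
    for (const WorkerStats& s : stats) total += s.steps;
    return total;
}
```

Without the alignas, two 16-byte WorkerStats objects would fit in the same cache line, and writes from different cores would repeatedly invalidate each other's copy of that line even though they touch different variables. Padding trades a little memory for the removal of that contention.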
