SkyNet: Belief-Aware Planning for Partially-Observable Stochastic Games
Adam Haile

TL;DR
SkyNet extends MuZero to partially observable stochastic games by incorporating belief-aware auxiliary heads, significantly improving performance in complex, uncertain environments without altering the core search algorithm.
Contribution
Introduces SkyNet, a belief-aware extension of MuZero that enhances latent representations for partial observability through auxiliary objectives, without changing the underlying search process.
Findings
SkyNet achieves a 75.3% peak win rate against the baseline in Skyjo.
Against heuristic opponents, SkyNet outperforms the baseline, reaching a win rate of 0.720 (72.0%).
Belief-aware auxiliary supervision improves learned representations given sufficient data flow.
Abstract
In 2019, Google DeepMind released MuZero, a model-based reinforcement learning method that achieves strong results in perfect-information games by combining learned dynamics models with Monte Carlo Tree Search (MCTS). However, comparatively little work has extended MuZero to partially observable, stochastic, multi-player environments, where agents must act under uncertainty about hidden state. Such settings arise not only in card games but in domains such as autonomous negotiation, financial trading, and multi-agent robotics. In the absence of explicit belief modeling, MuZero's latent encoding has no dedicated mechanism for representing uncertainty over unobserved variables. To address this, we introduce SkyNet (Belief-Aware MuZero), which adds ego-conditioned auxiliary heads for winner prediction and rank estimation to the standard MuZero architecture. These objectives encourage the…
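As a rough illustration of how ego-conditioned auxiliary heads for winner prediction and rank estimation could contribute to training, the sketch below adds two auxiliary loss terms to an existing MuZero loss. It is not the paper's implementation. The function names, the choice of cross-entropy for both heads, the seat-rotation scheme used for ego-conditioning, and the weights `w_winner` and `w_rank` are all assumptions made for this example. The search procedure is untouched; only the training objective gains extra terms.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-likelihood of the target class under softmax(logits)."""
    return -math.log(softmax(logits)[target_index])


def ego_relative_index(player, ego, num_players):
    """Seat of `player` counted from the ego player (ego itself maps to 0).

    Hypothetical ego-conditioning scheme: targets are expressed relative to
    the acting player rather than in absolute seat order.
    """
    return (player - ego) % num_players


def auxiliary_loss(winner_logits, rank_logits, winner, ego, ego_rank,
                   w_winner=0.5, w_rank=0.5):
    """Weighted sum of the two auxiliary head losses (assumed form).

    winner_logits: one logit per seat, in ego-relative order.
    rank_logits:   one logit per possible final rank of the ego player.
    winner:        absolute seat index of the player who won.
    ego_rank:      0-based final rank of the ego player.
    """
    n = len(winner_logits)
    winner_target = ego_relative_index(winner, ego, n)
    l_winner = cross_entropy(winner_logits, winner_target)
    l_rank = cross_entropy(rank_logits, ego_rank)
    return w_winner * l_winner + w_rank * l_rank


def total_loss(muzero_loss, aux_loss):
    """Standard MuZero loss (policy + value + reward terms) plus auxiliaries."""
    return muzero_loss + aux_loss
```

With three players, `auxiliary_loss([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], winner=2, ego=1, ego_rank=0)` evaluates both heads at a uniform prediction, so each cross-entropy term equals `log(3)`. A trained model would be expected to drive these terms lower as its latent state comes to encode more information about the hidden parts of the game.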
