Selecting Belief-State Approximations in Simulators with Latent States
Nan Jiang

TL;DR
This paper studies how to select among approximate belief-state samplers in simulators with latent states when only sampling access to the simulator is available. It proposes algorithms and analyses for this setting, with implications for sample-based planning and simulator calibration.
Contribution
The paper reduces the belief-state selection problem to a general conditional distribution-selection task and develops algorithms with theoretical guarantees under sampling-only access to the simulator.
Findings
Observation-based selection may fail under Single-Reset rollouts.
Latent state-based selection provides guarantees under Repeated-Reset.
The paper discusses the impact of distribution shift and sampling policies.
Abstract
State resetting is a fundamental but often overlooked capability of simulators. It supports sample-based planning by allowing resets to previously encountered simulation states, and enables calibration of simulators using real data by resetting to states observed in real-system traces. While often taken for granted, state resetting in complex simulators can be nontrivial: when the simulator comes with latent variables (states), state resetting requires sampling from the posterior over the latent state given the observable history, a.k.a. the belief state (Silver and Veness, 2010). While exact sampling is often infeasible, many approximate belief-state samplers can be constructed, raising the question of how to select among them using only sampling access to the simulator. In this paper, we show that this problem reduces to a general conditional distribution-selection task and develop…
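To make the notion of a belief state concrete, here is a minimal sketch of exact belief-state sampling by rejection in a toy simulator. It is an illustration only, not the paper's algorithm: the two-state latent chain, the noise levels, and every function name (step, rollout, sample_belief_by_rejection) are hypothetical. The sketch rolls the simulator forward from scratch and keeps the final latent state only when the simulated observations match the target history, so the kept states are draws from the posterior over the latent state given that history.

```python
import random
from collections import Counter

def step(latent, rng):
    """Hypothetical toy simulator step: 2-state latent chain, noisy binary observation."""
    # The latent state flips with probability 0.2 (assumed dynamics).
    if rng.random() < 0.2:
        latent = 1 - latent
    # The observation equals the latent state with probability 0.8 (assumed noise).
    obs = latent if rng.random() < 0.8 else 1 - latent
    return latent, obs

def rollout(horizon, rng):
    """Simulate from a uniform initial latent state; return final latent and observations."""
    latent = rng.randint(0, 1)
    observations = []
    for _ in range(horizon):
        latent, obs = step(latent, rng)
        observations.append(obs)
    return latent, observations

def sample_belief_by_rejection(history, n_samples, rng, max_tries=1_000_000):
    """Draw latent states from the posterior given `history` by rejection sampling.

    A rollout is accepted only if its observations match `history` exactly,
    so the cost grows quickly with the history length.
    """
    samples = []
    tries = 0
    while len(samples) < n_samples and tries < max_tries:
        tries += 1
        latent, observations = rollout(len(history), rng)
        if observations == history:
            samples.append(latent)
    return samples

rng = random.Random(0)
history = [1, 1, 1]
samples = sample_belief_by_rejection(history, 2000, rng)
# Empirical belief state: estimated posterior probability of each latent value.
belief = {s: c / len(samples) for s, c in Counter(samples).items()}
print(belief)
```

The acceptance rate of this scheme shrinks as the history grows, which is one way to see why exact sampling is often infeasible in complex simulators and why approximate belief-state samplers, and a way of choosing among them, are needed.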
Taxonomy
Topics: AI-based Problem Solving and Planning · Robotic Path Planning Algorithms · Reinforcement Learning in Robotics
