Decision Making in Non-Stationary Environments with Policy-Augmented Monte Carlo Tree Search
Geoffrey Pettet, Ayan Mukhopadhyay, Abhishek Dubey

TL;DR
This paper introduces Policy-Augmented MCTS (PA-MCTS), a hybrid decision-making method that combines a policy learned with reinforcement learning and Monte Carlo Tree Search so that it can adapt efficiently to changing environments. In non-stationary settings it earns higher reward than the standalone policy and converges faster than pure MCTS.
Contribution
The paper proposes PA-MCTS, a novel hybrid approach that integrates policy estimates into MCTS to improve convergence speed and decision quality in non-stationary environments.
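To make "integrating policy estimates into MCTS" concrete, here is a minimal sketch of one way the two estimates could be combined at decision time. It assumes a convex weighting between the policy's per-action value estimates and the returns estimated by tree search. The weight `alpha`, the function names, and the list-of-floats interface are illustrative assumptions, not details taken from the summary above.

```python
from typing import List, Sequence


def blend_action_values(
    q_policy: Sequence[float],
    q_mcts: Sequence[float],
    alpha: float,
) -> List[float]:
    """Convex combination of two per-action value estimates.

    q_policy: values from a (possibly stale) learned policy, one per action.
    q_mcts:   values estimated by tree search on the current environment model.
    alpha:    hypothetical weight in [0, 1]; 1.0 trusts only the policy,
              0.0 trusts only the search.
    """
    if len(q_policy) != len(q_mcts):
        raise ValueError("both estimates must cover the same actions")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * qp + (1.0 - alpha) * qm for qp, qm in zip(q_policy, q_mcts)]


def select_action(
    q_policy: Sequence[float],
    q_mcts: Sequence[float],
    alpha: float,
) -> int:
    """Return the index of the action with the highest blended value."""
    blended = blend_action_values(q_policy, q_mcts, alpha)
    return max(range(len(blended)), key=blended.__getitem__)


if __name__ == "__main__":
    # The stale policy prefers action 0; search on the updated model prefers action 1.
    stale_policy_values = [1.0, 0.0]
    search_values = [0.0, 1.0]
    for weight in (1.0, 0.25, 0.0):
        print(weight, select_action(stale_policy_values, search_values, weight))
```

In this sketch, lowering `alpha` shifts trust from the offline policy toward the online search, which is the intuition behind using search to correct a policy that has gone stale after an environmental change.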
Findings
PA-MCTS achieves higher rewards than a standalone policy in non-stationary CartPole.
PA-MCTS converges faster than pure MCTS in dynamic environments.
The hybrid approach effectively adapts to environmental shifts.
Abstract
Decision-making under uncertainty (DMU) is present in many important problems. An open challenge is DMU in non-stationary environments, where the dynamics of the environment can change over time. Reinforcement Learning (RL), a popular approach for DMU problems, learns a policy by interacting with a model of the environment offline. Unfortunately, if the environment changes, the policy can become stale and take sub-optimal actions, and relearning the policy for the updated environment takes time and computational effort. An alternative is online planning approaches such as Monte Carlo Tree Search (MCTS), which perform their computation at decision time. Given the current environment, MCTS plans using high-fidelity models to determine promising action trajectories. These models can be updated as soon as environmental changes are detected to immediately incorporate them into decision…
Taxonomy
Topics: Complex Systems and Decision Making · Reinforcement Learning in Robotics
