Decision Making in Non-Stationary Environments with Policy-Augmented Monte Carlo Tree Search

Geoffrey Pettet; Ayan Mukhopadhyay; Abhishek Dubey

arXiv:2202.13003 · cs.AI · March 1, 2022

Open Access

TL;DR

This paper introduces Policy-Augmented MCTS (PA-MCTS), a hybrid decision-making method that combines a policy learned through reinforcement learning with Monte Carlo Tree Search so that decisions adapt efficiently when the environment changes. In non-stationary settings it outperforms either approach used alone.

Contribution

The paper proposes PA-MCTS, a novel hybrid approach that integrates policy estimates into MCTS to improve convergence speed and decision quality in non-stationary environments.
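
As an illustration only, one way to picture "integrating policy estimates into MCTS" is to blend the two sets of action-value estimates at decision time. The sketch below assumes a convex combination controlled by a hypothetical weight `alpha`. The function name `select_action`, the dictionaries `policy_q` and `mcts_returns`, and the weight itself are assumptions made for this example; nothing on this page confirms that they match the authors' implementation.

```python
from typing import Dict, Hashable


def select_action(
    policy_q: Dict[Hashable, float],
    mcts_returns: Dict[Hashable, float],
    alpha: float,
) -> Hashable:
    """Pick the action with the best blended value estimate.

    policy_q     -- per-action estimates from a previously learned policy
                    (hypothetical input; may be stale after a change)
    mcts_returns -- per-action average returns from an online tree search
                    (hypothetical input; reflects the current environment)
    alpha        -- assumed blending weight in [0, 1]; 1.0 trusts only the
                    policy, 0.0 trusts only the search
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Only actions scored by both sources can be compared.
    actions = policy_q.keys() & mcts_returns.keys()
    if not actions:
        raise ValueError("no action is scored by both estimators")

    def blended(action: Hashable) -> float:
        return alpha * policy_q[action] + (1.0 - alpha) * mcts_returns[action]

    # Sort first so that ties are broken deterministically.
    return max(sorted(actions, key=repr), key=blended)


if __name__ == "__main__":
    stale_policy = {"left": 1.0, "right": 0.2}
    fresh_search = {"left": 0.1, "right": 1.5}
    for weight in (0.0, 0.5, 0.9, 1.0):
        print(weight, select_action(stale_policy, fresh_search, weight))
```

With a weight near 1 the choice follows the learned policy, and with a weight near 0 it follows the search, which is the trade-off a hybrid method has to manage when the environment shifts.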

Findings

01

PA-MCTS achieves higher rewards than a standalone learned policy in a non-stationary CartPole environment.

02

PA-MCTS converges faster than pure MCTS in dynamic environments.

03

The hybrid approach effectively adapts to environmental shifts.

Abstract

Decision-making under uncertainty (DMU) is present in many important problems. An open challenge is DMU in non-stationary environments, where the dynamics of the environment can change over time. Reinforcement Learning (RL), a popular approach for DMU problems, learns a policy by interacting with a model of the environment offline. Unfortunately, if the environment changes, the policy can become stale and take sub-optimal actions, and relearning the policy for the updated environment takes time and computational effort. An alternative is online planning approaches such as Monte Carlo Tree Search (MCTS), which perform their computation at decision time. Given the current environment, MCTS plans using high-fidelity models to determine promising action trajectories. These models can be updated as soon as environmental changes are detected to immediately incorporate them into decision…

Taxonomy

Topics: Complex Systems and Decision Making · Reinforcement Learning in Robotics