PlanU: Large Language Model Reasoning through Planning under Uncertainty
Ziwei Deng, Mian Deng, Chenjing Liang, Zeming Gao, Chennan Ma, Chenxing Lin, Haipeng Zhang, Songzhu Mei, Siqi Shen, Cheng Wang

TL;DR
PlanU is a novel LLM-based planning approach that incorporates uncertainty modeling through Monte Carlo Tree Search with quantile distributions, improving reasoning in stochastic environments.
Contribution
The paper introduces PlanU, a planning method that captures both environmental uncertainty and LLM uncertainty using quantile-based MCTS, advancing LLM reasoning under uncertainty.
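To make the idea of representing a node's return as a quantile distribution concrete, here is a minimal sketch. It is not taken from the paper: the function names, the learning rate, and the use of a standard quantile-regression (pinball loss) update are all assumptions chosen for illustration. Each node stores N quantile estimates instead of a single mean, so a bimodal return caused by a stochastic transition stays visible rather than being averaged away.

```python
import random


def make_quantiles(n, init=0.0):
    """Hypothetical helper: n quantile estimates for one search-tree node."""
    return [init] * n


def quantile_update(quantiles, sample, lr=0.05):
    """Move each quantile estimate one subgradient step on the pinball loss.

    Quantile i targets level tau_i = (i + 0.5) / n. If the sampled return is
    above the estimate, the estimate rises by lr * tau_i; otherwise it falls
    by lr * (1 - tau_i). This is a generic quantile-regression update, used
    here as an assumption, not the paper's exact rule.
    """
    n = len(quantiles)
    for i in range(n):
        tau = (i + 0.5) / n
        if sample > quantiles[i]:
            quantiles[i] += lr * tau
        else:
            quantiles[i] -= lr * (1.0 - tau)
    return quantiles


def mean_return(quantiles):
    """Expected return approximated as the average of the quantile estimates."""
    return sum(quantiles) / len(quantiles)


if __name__ == "__main__":
    rng = random.Random(0)
    qs = make_quantiles(4)
    # Toy stochastic environment: the same action yields return 0 or 1.
    for _ in range(5000):
        quantile_update(qs, float(rng.random() < 0.5))
    print(qs, mean_return(qs))
```

With a coin-flip return, the low quantiles settle near 0 and the high quantiles near 1, while a single scalar estimate would report only a value near 0.5.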
Findings
PlanU outperforms existing methods on reasoning tasks under uncertainty.
Quantile distribution modeling improves uncertainty estimation.
The Upper Confidence Bounds with Curiosity (UCC) score balances exploration and exploitation during tree search.
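The summary does not give the UCC formula, so the sketch below does not attempt to reproduce it. It only illustrates the general pattern of a selection score that trades off exploitation against exploration: the standard UCB1 rule, plus a hypothetical bonus proportional to the spread of a node's quantile estimates. The names `ucb_with_spread_bonus` and `select_child` and the constants `c` and `beta` are assumptions introduced for this example.

```python
import math


def ucb_with_spread_bonus(quantiles, child_visits, parent_visits, c=1.4, beta=0.5):
    """Illustrative selection score (NOT the paper's UCC formula).

    mean    : exploitation term, the average of the node's quantile estimates
    explore : standard UCB1 term, c * sqrt(ln(parent_visits) / child_visits)
    spread  : hypothetical uncertainty bonus, max quantile minus min quantile
    """
    if child_visits == 0:
        return math.inf  # always try unvisited children first
    mean = sum(quantiles) / len(quantiles)
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    spread = max(quantiles) - min(quantiles)
    return mean + explore + beta * spread


def select_child(children, parent_visits):
    """children: list of (name, quantiles, visits); returns the best name."""
    best = max(
        children,
        key=lambda ch: ucb_with_spread_bonus(ch[1], ch[2], parent_visits),
    )
    return best[0]


if __name__ == "__main__":
    children = [
        ("certain", [0.5, 0.5, 0.5], 10),
        ("uncertain", [0.0, 0.5, 1.0], 10),
    ]
    print(select_child(children, parent_visits=20))
```

In this toy setup both children have the same mean and visit count, so the spread term alone decides the choice. Whether a wide return distribution should attract or repel search is a design decision; the paper's own score should be consulted for how it handles that.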
Abstract
Large Language Models (LLMs) are increasingly being explored across a range of reasoning tasks. However, LLMs sometimes struggle with reasoning tasks under uncertainty that are relatively easy for humans, such as planning actions in stochastic environments. The adoption of LLMs for reasoning is impeded by uncertainty challenges, such as LLM uncertainty and environmental uncertainty. LLM uncertainty arises from the stochastic sampling process inherent to LLMs. Most LLM-based Decision-Making (LDM) approaches address LLM uncertainty through multiple reasoning chains or search trees. However, these approaches overlook environmental uncertainty, which leads to poor performance in environments with stochastic state transitions. Some recent LDM approaches deal with uncertainty by forecasting the probability of unknown variables. However, they are not designed for multi-step reasoning tasks…
Taxonomy
Topics: Text Readability and Simplification · Multimodal Machine Learning Applications · Topic Modeling
