Feedback-Induced Performance Decline in LLM-Based Decision-Making
Xiao Yang, Juxi Leitner, Michael Burke

TL;DR
This paper investigates how Large Language Models (LLMs) perform in decision-making tasks framed as Markov Decision Processes. It finds that feedback mechanisms can degrade performance in complex environments, which points to the need for hybrid strategies.
Contribution
It provides a comparative analysis of LLM-based decision-making versus classical RL, identifying feedback-induced performance decline and suggesting directions for improvement.
Findings
LLMs perform well initially in simple tasks
Feedback mechanisms can cause confusion in complex scenarios
Hybrid strategies may enhance LLM decision-making
Abstract
The ability of Large Language Models (LLMs) to extract context from natural language problem descriptions naturally raises questions about their suitability in autonomous decision-making settings. This paper studies the behaviour of these models within Markov Decision Processes (MDPs). While traditional reinforcement learning (RL) strategies commonly employed in this setting rely on iterative exploration, LLMs, pre-trained on diverse datasets, offer the capability to leverage prior knowledge for faster adaptation. We investigate online structured prompting strategies in sequential decision-making tasks, comparing the zero-shot performance of LLM-based approaches to that of classical RL methods. Our findings reveal that although LLMs demonstrate improved initial performance in simpler environments, they struggle with planning and reasoning in complex scenarios without fine-tuning or…
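To make the "iterative exploration" of classical RL concrete, here is a minimal sketch of tabular Q-learning on a toy MDP. It is an illustration only: the five-state chain environment, the hyperparameters, and the function names (`step`, `q_learning`, `greedy_policy`) are assumptions made for this example and are not taken from the paper, which may use different environments and baselines.

```python
import random

# Hypothetical 5-state chain MDP (not from the paper):
# states 0..4, actions 0 (left) and 1 (right), goal at state 4.
N_STATES, GOAL = 5, 4


def step(state, action):
    """Deterministic transition; reward 1.0 only when the goal is reached."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table by epsilon-greedy trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < 100:  # cap episode length
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # One-step temporal-difference target; no bootstrap at the goal.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state, steps = next_state, steps + 1
    return q


def greedy_policy(q):
    """Greedy action for each non-goal state."""
    return [max((0, 1), key=lambda a: q[s][a]) for s in range(GOAL)]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

The agent starts with no knowledge of the environment and only learns to move right after reward has propagated back through repeated episodes. That cost of exploration is what the abstract contrasts with an LLM's use of prior knowledge for faster initial adaptation.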
