Can We Rely on LLM Agents to Draft Long-Horizon Plans? Let's Take TravelPlanner as an Example
Yanan Chen, Ali Pesaranghader, Tanmana Sadhu, Dong Hoon Yi

TL;DR
This study evaluates the robustness and limitations of LLM-based agents in complex, long-horizon planning tasks using the TravelPlanner benchmark, highlighting challenges and proposing a feedback-aware fine-tuning method for improvement.
Contribution
The paper introduces a comprehensive analysis of LLM agents in real-world planning, identifying key failure modes and proposing Feedback-Aware Fine-Tuning (FAFT) to enhance performance.
Findings
LLMs struggle to focus on crucial parts of long contexts.
They have difficulty analyzing and providing feedback on long plans.
FAFT significantly improves LLM performance over standard fine-tuning.
Abstract
Large language models (LLMs) have brought autonomous agents closer to artificial general intelligence (AGI) due to their promising generalization and emergent capabilities. There is, however, a lack of studies on how LLM-based agents behave, why they could potentially fail, and how to improve them, particularly in demanding real-world planning tasks. In this paper, as an effort to fill the gap, we present our study using a realistic benchmark, TravelPlanner, where an agent must meet multiple constraints to generate accurate plans. We leverage this benchmark to address four key research questions: (1) are LLM agents robust enough to lengthy and noisy contexts when it comes to reasoning and planning? (2) can few-shot prompting adversely impact the performance of LLM agents in scenarios with long context? (3) can we rely on refinement to improve plans, and (4) can fine-tuning LLMs with…
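To make the idea of a plan that "must meet multiple constraints" concrete, here is a minimal sketch of a constraint checker. It is not taken from TravelPlanner or from the paper. The `DayPlan` type, the `check_plan` function, and the three constraints (trip length, budget, non-empty city) are hypothetical, chosen only to illustrate the kind of check a plan might be evaluated against.

```python
from dataclasses import dataclass


@dataclass
class DayPlan:
    # Hypothetical fields; the real benchmark's plan schema may differ.
    city: str
    cost: float  # total spend for the day


def check_plan(days, budget, required_days):
    """Return the names of violated constraints (empty list means the plan passes)."""
    violations = []
    if len(days) != required_days:
        violations.append("trip_length")
    if sum(d.cost for d in days) > budget:
        violations.append("budget")
    if any(not d.city for d in days):
        violations.append("missing_city")
    return violations


plan = [
    DayPlan("Seattle", 400.0),
    DayPlan("Seattle", 350.0),
    DayPlan("Portland", 300.0),
]
print(check_plan(plan, budget=1000.0, required_days=3))
```

In this toy setup a plan counts as accurate only when the returned list is empty. A list of named violations like this is also one simple form that feedback on a draft plan could take.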
Taxonomy
Topics: European and International Contract Law · Corporate Governance and Law · Conflict of Laws and Jurisdiction
