Chain of Thoughtlessness? An Analysis of CoT in Planning

Kaya Stechly; Karthik Valmeekam; Subbarao Kambhampati

arXiv:2405.04776 · cs.AI · March 13, 2025 · 5 cites

TL;DR

This study critically examines the effectiveness of chain of thought prompting in large language models on classical planning problems. It finds that the benefits are limited to highly problem-specific prompts and do not indicate that the model has learned a general algorithm.

Contribution

The paper provides a detailed case study showing that chain of thought prompts require extensive problem-specific engineering and do not facilitate general algorithm learning in LLMs.

Findings

01. Performance gains depend on highly specific prompts

02. Improvements diminish as query complexity increases

03. Failure modes are consistent across domains

Abstract

Large language model (LLM) performance on reasoning problems typically does not generalize out of distribution. Previous work has claimed that this can be mitigated with chain of thought prompting, a method of demonstrating solution procedures, with the intuition that it is possible to teach an LLM an algorithm for solving the problem in context. This paper presents a case study of chain of thought on problems from Blocksworld, a classical planning domain, and examines the performance of two state-of-the-art LLMs across two axes: generality of examples given in prompt, and complexity of problems queried with each prompt. While our problems are very simple, we find meaningful performance improvements from chain of thought prompts only when those prompts are exceedingly specific to their problem class, and those improvements quickly deteriorate as the size n of the query-specified…
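
For readers unfamiliar with Blocksworld, the following is a minimal sketch of what a problem instance and a plan check might look like. It is not the paper's code or evaluation harness. The state encoding, the single "move block to destination" action (a simplification of the usual pickup/putdown/stack/unstack operators), and all function names are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: each block maps to what it rests on,
# either another block's name or the string "table".
State = Dict[str, str]
Move = Tuple[str, str]  # (block to move, destination block or "table")


def is_clear(state: State, block: str) -> bool:
    """A block is clear when no other block rests on it."""
    return all(support != block for support in state.values())


def apply_move(state: State, move: Move) -> State:
    """Return the state after one move, or raise ValueError if it is illegal."""
    block, dest = move
    if block not in state:
        raise ValueError(f"unknown block {block!r}")
    if not is_clear(state, block):
        raise ValueError(f"{block!r} has something on top of it")
    if dest != "table":
        if dest == block or dest not in state:
            raise ValueError(f"invalid destination {dest!r}")
        if not is_clear(state, dest):
            raise ValueError(f"{dest!r} has something on top of it")
    new_state = dict(state)
    new_state[block] = dest
    return new_state


def plan_reaches_goal(initial: State, plan: List[Move], goal: State) -> bool:
    """Check that every move is legal and the final state satisfies the goal."""
    state = dict(initial)
    try:
        for move in plan:
            state = apply_move(state, move)
    except ValueError:
        return False
    return all(state.get(block) == support for block, support in goal.items())


# A three-block instance: B starts on A; the goal is the stack A on B on C.
initial = {"A": "table", "B": "A", "C": "table"}
goal = {"A": "B", "B": "C"}
good_plan = [("B", "C"), ("A", "B")]
bad_plan = [("A", "C")]  # illegal: B is still on top of A
```

Under this encoding, the number of blocks in `initial` is one natural reading of instance size: a larger instance generally needs a longer plan, and a plan counts as correct only if every step is legal and the goal holds at the end.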

Videos

AGI in 5 Years? Ben Goertzel on Superintelligence · youtube

Chain of Thoughtlessness? An Analysis of CoT in Planning · slideslive

Taxonomy

Topics: Educational Tools and Methods · Innovative Teaching and Learning Methods

Methods: Hierarchical Information Threading