Less Back-and-Forth: A Comparative Study of Structured Prompting
Saurav Ghosh, Gabriella Polach, and Abdou Sow

TL;DR
This study finds that structured prompts, especially checklist-based ones, improve large language model response quality across summarization, planning, explanation, and coding tasks while reducing user effort and back-and-forth interaction.
Contribution
The paper introduces and empirically evaluates structured prompt designs (a checklist-improved prompt and a clarifying-question prompt) and shows that they improve LLM output quality and reduce interaction compared with raw, unstructured prompts.
Findings
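Checklist prompts achieved the highest mean rubric score: 7.50 out of 8, versus 6.67 for clarifying-question prompts and 5.67 for raw prompts.
Checklist prompts also used fewer tokens on average, giving the best quality-effort tradeoff.
Both structured prompt conditions scored higher than raw prompts on average across the four task types (summarization, planning, explanation, and coding).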
Abstract
Large language models (LLMs) are widely used for open-ended tasks, but underspecified prompts can lead to low-quality answers and additional interaction. This paper studies whether structured prompt design improves response quality while reducing user effort. We compare three prompt conditions: a raw prompt, a checklist-improved prompt, and a clarifying-question prompt. We evaluate these conditions across four task types (summarization, planning, explanation, and coding) using three LLM systems: ChatGPT, Claude, and Grok. Each output is scored with a unified rubric covering task completion, correctness, compliance, and clarity. Checklist-improved prompts achieved the highest mean rubric score, 7.50 out of 8, compared with 5.67 for raw prompts and 6.67 for clarifying-question prompts. Checklist prompts also produced the best quality-effort tradeoff, using fewer average tokens than both…
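The evaluation design described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, and the 0 to 2 scale per criterion is an assumption inferred from four criteria and a maximum of 8, since the abstract does not state the per-criterion scale. The sketch also assumes one scored output per combination of condition, task, and model.

```python
from itertools import product
from statistics import mean

# Names taken from the abstract.
CONDITIONS = ["raw", "checklist", "clarifying_question"]
TASKS = ["summarization", "planning", "explanation", "coding"]
MODELS = ["ChatGPT", "Claude", "Grok"]
CRITERIA = ["task_completion", "correctness", "compliance", "clarity"]

# ASSUMPTION: each criterion is scored 0-2, so four criteria give a maximum of 8.
MAX_PER_CRITERION = 2


def experiment_grid():
    """Return every (condition, task, model) cell of the study design."""
    return list(product(CONDITIONS, TASKS, MODELS))


def total_score(scores):
    """Sum one output's per-criterion scores after validating them."""
    if set(scores) != set(CRITERIA):
        raise ValueError(f"expected exactly these criteria: {CRITERIA}")
    for name, value in scores.items():
        if not 0 <= value <= MAX_PER_CRITERION:
            raise ValueError(f"{name} score {value} is outside 0..{MAX_PER_CRITERION}")
    return sum(scores.values())


def mean_score_by_condition(results):
    """Average total rubric scores per prompt condition.

    `results` maps (condition, task, model) -> {criterion: score}.
    """
    totals = {condition: [] for condition in CONDITIONS}
    for (condition, _task, _model), scores in results.items():
        totals[condition].append(total_score(scores))
    return {c: mean(values) for c, values in totals.items() if values}
```

Under the one-output-per-cell assumption, `experiment_grid()` yields 3 × 4 × 3 = 36 cells, or 12 scored outputs per prompt condition, and `mean_score_by_condition` produces the kind of per-condition mean the abstract reports.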
