Navigating the Prompt Space: Improving LLM Classification of Social Science Texts Through Prompt Engineering
Erkan Gunes, Christoffer Florczak, Tevfik Murat Yildirim

TL;DR
This paper investigates how prompt engineering (specifically label descriptions, instructions, and few-shot examples) can improve the accuracy of LLMs in social science text classification, and highlights the need for task-specific prompts and validation.
Contribution
It systematically examines how variations in prompt context affect LLM classification performance, showing that a minimal increase in context yields the largest gain in accuracy.
Findings
Small increases in prompt context boost performance
Further increases yield diminishing returns and sometimes decrease accuracy
Model and task heterogeneity require individual validation
Abstract
Recent developments in text classification using Large Language Models (LLMs) in the social sciences suggest that costs can be cut significantly, while performance can sometimes rival existing computational methods. However, given the wide variance in performance in current tests, we turn to the question of how to maximize performance. In this paper, we focus on prompt context as a possible avenue for increasing accuracy by systematically varying three aspects of prompt engineering: label descriptions, instructional nudges, and few-shot examples. Across two different examples, our tests illustrate that a minimal increase in prompt context yields the highest increase in performance, while further increases in context tend to yield only marginal performance increases thereafter. Alarmingly, increasing prompt context sometimes decreases accuracy. Furthermore, our tests suggest substantial…
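To make the idea of "systematically varying" prompt context concrete, the following is a minimal sketch of how a grid of prompt variants could be assembled from the three components the abstract names. It is not the authors' code: the labels, label descriptions, nudge wording, few-shot examples, and the function names build_prompt and prompt_grid are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical labels, descriptions, nudge, and examples (not taken from the paper).
LABELS = {
    "economy": "Text about taxes, jobs, inflation, or the budget.",
    "health": "Text about healthcare, hospitals, or public health.",
}
NUDGE = "Answer with exactly one label and nothing else."
EXAMPLES = [
    ("The government plans to cut income taxes.", "economy"),
    ("Hospitals report a shortage of nurses.", "health"),
]


def build_prompt(text, use_descriptions, use_nudge, n_shots):
    """Assemble one classification prompt from the selected context components."""
    lines = ["Classify the text into one of these categories:"]
    for label, description in LABELS.items():
        # Label descriptions are the first optional component.
        lines.append(f"- {label}: {description}" if use_descriptions else f"- {label}")
    if use_nudge:
        # The instructional nudge is the second optional component.
        lines.append(NUDGE)
    for example_text, example_label in EXAMPLES[:n_shots]:
        # Few-shot examples are the third optional component.
        lines.append(f"Text: {example_text}\nLabel: {example_label}")
    lines.append(f"Text: {text}\nLabel:")
    return "\n".join(lines)


def prompt_grid(text, shot_counts=(0, 2)):
    """Return every combination of the three components, keyed by its settings."""
    return {
        (d, n, k): build_prompt(text, d, n, k)
        for d, n, k in product((False, True), (False, True), shot_counts)
    }
```

Each key in the returned dictionary is a (descriptions, nudge, shots) setting, so the accuracy of each variant could be compared against the minimal prompt keyed by (False, False, 0) on a labeled validation set, for each model and task separately.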
Taxonomy
Topics: Computational and Text Analysis Methods · Text and Document Classification Technologies · Topic Modeling
