The Prompt is Mightier than the Example
Shengzhe Xu, Nikhil Muralidhar, Naren Ramakrishnan

TL;DR
This paper investigates how explicit domain knowledge in prompts can reduce the need for numerous in-context examples in large language models, offering a scalable alternative for high-quality synthetic data generation.
Contribution
It introduces Knowledge-Guided Prompting (KGP), a novel method that incorporates domain knowledge into prompts to offset the reliance on many in-context examples in LLMs.
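As a rough illustration of the idea, the sketch below assembles a prompt that pairs explicit domain-knowledge statements with a small number of in-context example rows. It is not the authors' prompt template: the function name `build_kgp_prompt`, its parameters, the prompt wording, and the sample column names and facts are all hypothetical and chosen only to show the general shape of such a prompt.

```python
def build_kgp_prompt(column_names, knowledge, examples, n_rows):
    """Assemble a text prompt from domain knowledge and a few example rows.

    Hypothetical helper for illustration only; the wording and layout
    are assumptions, not the template used in the paper.
    """
    lines = [
        "Generate synthetic tabular data with columns: "
        + ", ".join(column_names) + "."
    ]
    # Explicit domain knowledge, one statement per bullet.
    if knowledge:
        lines.append("Domain knowledge:")
        lines.extend(f"- {fact}" for fact in knowledge)
    # A small number of in-context examples, written as CSV rows.
    if examples:
        lines.append("Examples:")
        lines.extend(",".join(str(v) for v in row) for row in examples)
    lines.append(f"Now generate {n_rows} new rows in the same CSV format.")
    return "\n".join(lines)


# Made-up inputs: two knowledge statements and a single example row.
prompt = build_kgp_prompt(
    column_names=["voltage", "current"],
    knowledge=["voltage lies between 0 and 5", "current is non-negative"],
    examples=[(3.3, 0.12)],
    n_rows=10,
)
print(prompt)
```

The resulting string would then be sent to an LLM of choice. Varying the length of `knowledge` against the length of `examples` is one way to probe the trade-off described here.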
Findings
KGP can effectively substitute for multiple ICL examples.
An empirical scaling law quantifies the trade-off between knowledge and examples.
KGP enhances synthetic data quality with fewer in-context examples.
Abstract
Numerous recent prompt optimization approaches, such as chain-of-thought, have been demonstrated to significantly improve the quality of content generated by large language models (LLMs). In-context learning (ICL), a recent paradigm in which a few representative examples guide content generation, has also led to strong improvements in the quality of LLM-generated content. This idea has been applied to great effect in synthetic tabular data generation, where LLMs, through effective use of ICL and prompt optimization, can generate data that approximate samples from complex, heterogeneous distributions based on representative examples. However, ensuring high-fidelity synthetic data often requires a very large number of ICL examples, which may be unavailable or costly to obtain. At the same time, as LLMs get larger and larger, their in-built prior knowledge becomes vast and can potentially…
