TL;DR
This paper introduces a cost-efficient two-stage pipeline that uses cross-task examples and graph-based label propagation to build in-task demonstrations for large language models, reducing the need for expensive data labeling.
Contribution
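The proposed method uses cross-task prompting to pseudo-label a small set of target-task instances, then applies graph-based label propagation to label the remaining instances without additional LLM queries. The fully pseudo-labeled set supplies the in-task demonstrations.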
Findings
Achieves strong task performance with reduced labeling costs
Effectively propagates labels to unlabeled data using graph methods
Reduces reliance on large language models for data annotation
Abstract
The capability of in-context learning (ICL) enables large language models (LLMs) to perform novel tasks without parameter updates by conditioning on a few input-output examples. However, collecting high-quality examples for new or challenging tasks can be costly and labor-intensive. In this work, we propose a cost-efficient two-stage pipeline that reduces reliance on LLMs for data labeling. Our approach first leverages readily available cross-task examples to prompt an LLM and pseudo-label a small set of target task instances. We then introduce a graph-based label propagation method that spreads label information to the remaining target examples without additional LLM queries. The resulting fully pseudo-labeled dataset is used to construct in-task demonstrations for ICL. This pipeline combines the flexibility of cross-task supervision with the scalability of LLM-free propagation.…
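The second stage can be illustrated with a minimal sketch. The abstract does not specify how the graph is built or which update rule is used, so the code below is an assumption: it substitutes a generic k-nearest-neighbor cosine graph and the classic normalized label-spreading iteration, not the paper's own algorithm. The function name `propagate_labels` and the parameters `k`, `alpha`, and `iters` are hypothetical.

```python
import numpy as np


def propagate_labels(embeddings, seed_labels, num_classes, k=2, alpha=0.9, iters=50):
    """Spread a few seed labels over a kNN graph (generic label spreading).

    embeddings  : (n, d) array of instance vectors (assumed precomputed).
    seed_labels : length-n int array; class id for LLM pseudo-labeled
                  instances, -1 for instances still unlabeled.
    Requires k < n. Returns a length-n int array of predicted class ids.
    """
    X = np.asarray(embeddings, dtype=float)
    seed_labels = np.asarray(seed_labels)
    n = X.shape[0]

    # Cosine similarity between all pairs; exclude self-similarity.
    Xn = X / np.clip(np.linalg.norm(X, axis=1, keepdims=True), 1e-12, None)
    S = Xn @ Xn.T
    np.fill_diagonal(S, -np.inf)

    # Keep each node's k most similar neighbors, then symmetrize.
    nbrs = np.argsort(-S, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.clip(S[rows, cols], 0.0, None)
    W = np.maximum(W, W.T)

    # Symmetric normalization: D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.clip(W.sum(axis=1), 1e-12, None))
    S_norm = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # One-hot seed matrix; unlabeled rows stay all-zero.
    labeled = seed_labels >= 0
    Y = np.zeros((n, num_classes))
    Y[labeled, seed_labels[labeled]] = 1.0

    # Iterate F <- alpha * S_norm @ F + (1 - alpha) * Y.
    F = Y.copy()
    for _ in range(iters):
        F = alpha * (S_norm @ F) + (1.0 - alpha) * Y

    pred = F.argmax(axis=1)
    pred[labeled] = seed_labels[labeled]  # seeds keep their given label
    return pred


# Toy usage: two clusters, one seed label in each.
toy_embeddings = [[1, 0], [0.9, 0.1], [0.95, 0.05],
                  [0, 1], [0.1, 0.9], [0.05, 0.95]]
toy_seeds = [0, -1, -1, 1, -1, -1]
toy_pred = propagate_labels(toy_embeddings, toy_seeds, num_classes=2, k=2)
```

In this sketch only the seed instances would require an LLM call; every other label comes from the graph. The propagated labels could then be paired with their inputs to form in-task demonstrations.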