V-SYNTHESIS: Task-Agnostic Synthesis of Consistent and Diverse In-Context Demonstrations from Scratch via V-Entropy
Dingzirui Wang, Xuanliang Zhang, Keyan Xu, Qingfu Zhu, Wanxiang Che, Yang Deng

TL;DR
V-Synthesis synthesizes consistent and diverse in-context demonstrations from scratch for arbitrary tasks. It samples candidates in proportion to V-Score, a new consistency metric, and reports a 2.0% improvement over existing methods.
Contribution
The paper introduces V-Synthesis, a task-agnostic demonstration synthesis method that uses V-Score to keep synthesized demonstrations both consistent with the target task and diverse, without relying on task-specific designs or pre-existing demonstrations.
Findings
V-Synthesis improves performance by 2.0% over existing methods.
V-Score achieves higher performance at lower computation cost than n-gram-based and embedding-based metrics.
V-Synthesis synthesizes demonstrations from scratch for arbitrary tasks, without pre-existing demonstrations.
Abstract
The high labeling cost of in-context learning (ICL) demonstrations motivates using large language models (LLMs) to synthesize them and reduce overhead. However, existing synthesis methods are mainly task-specific or rely on pre-existing demonstrations. This paper therefore focuses on synthesizing demonstrations from scratch for arbitrary tasks. A major challenge in synthesizing from scratch is ensuring consistency with the target task, as the lack of labeling guidance could lead to synthesis bias. We first propose a consistency metric called V-Score, which achieves higher performance at lower computation cost than metrics based on n-grams or embedding vectors. Furthermore, we introduce V-Synthesis, which leverages V-Score for proportional sampling to ensure both high consistency and diversity of synthesized demonstrations. Experimental results demonstrate that V-Synthesis yields an average…
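The abstract names proportional sampling by V-Score but gives no procedure here, so the following is only a minimal sketch of score-proportional sampling in general, not the authors' implementation. It assumes each candidate demonstration already has a non-negative consistency score; the function name `sample_proportional`, its parameters, and the example scores are hypothetical, and how V-Score itself is computed is not shown.

```python
import random


def sample_proportional(candidates, scores, k, seed=None):
    """Draw up to k distinct candidates, each with probability
    proportional to its score (sampling without replacement).

    Assumption: `scores` are precomputed, non-negative consistency
    scores. The V-Score computation itself is outside this sketch.
    """
    if len(candidates) != len(scores):
        raise ValueError("candidates and scores must have the same length")
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")

    rng = random.Random(seed)
    pool = list(zip(candidates, scores))
    chosen = []
    for _ in range(min(k, len(pool))):
        total = sum(s for _, s in pool)
        if total <= 0:
            # Every remaining score is zero: fall back to a uniform pick.
            idx = rng.randrange(len(pool))
        else:
            # Roulette-wheel selection over the remaining pool.
            r = rng.random() * total
            acc = 0.0
            idx = len(pool) - 1
            for i, (_, s) in enumerate(pool):
                acc += s
                if r < acc:
                    idx = i
                    break
        chosen.append(pool.pop(idx)[0])
    return chosen


if __name__ == "__main__":
    demos = ["demo_a", "demo_b", "demo_c", "demo_d"]
    hypothetical_scores = [0.9, 0.6, 0.3, 0.1]
    print(sample_proportional(demos, hypothetical_scores, k=2, seed=0))
```

Sampling in proportion to the score, rather than always taking the top-scoring candidates, favors consistent demonstrations while still letting lower-scoring ones through occasionally, which is one plausible way to trade consistency against diversity.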
Taxonomy
Topics: Machine Learning and Data Classification · Topic Modeling · Natural Language Processing Techniques
