Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models
Yeming Wen, Swarat Chaudhuri

TL;DR
The paper introduces SPA (Synthesize-Partition-Adapt), a framework for eliciting diverse, high-quality responses from foundation models. It uses data attribution methods such as influence functions to partition synthetic data into subsets, then trains a separate model adaptation on each subset.
Contribution
SPA is a novel framework that combines data synthesis, partitioning, and model adaptation to elicit diverse responses from foundation models without sacrificing quality.
Findings
SPA effectively diversifies responses in code generation and NLP tasks.
The approach maintains high response quality while increasing diversity.
In experiments, SPA outperforms baseline methods on response diversity.
Abstract
Presenting users with diverse responses from foundation models is crucial for enhancing user experience and accommodating varying preferences. However, generating multiple high-quality and diverse responses without sacrificing accuracy remains a challenge, especially when using greedy sampling. In this work, we propose a novel framework, Synthesize-Partition-Adapt (SPA), that leverages the abundant synthetic data available in many domains to elicit diverse responses from foundation models. By leveraging signal provided by data attribution methods such as influence functions, SPA partitions data into subsets, each targeting unique aspects of the data, and trains multiple model adaptations optimized for these subsets. Experimental results demonstrate the effectiveness of our approach in diversifying foundation model responses while maintaining high quality, showcased through the HumanEval…
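The partition-then-adapt step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes each synthetic sample already has a single scalar attribution score (for example, from an influence function), and it splits the score ranking into k contiguous buckets. The names partition_by_score, adapt_per_partition, and train_adapter are hypothetical, and the actual partitioning rule and adapter type used in the paper are not stated in this summary.

```python
from typing import Any, Callable, List, Sequence


def partition_by_score(scores: Sequence[float], k: int) -> List[List[int]]:
    """Split sample indices into k contiguous buckets of the score ranking.

    Assumption: one scalar attribution score per sample. Bucket sizes
    differ by at most one; higher-scoring samples land in earlier buckets.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank sample indices from highest to lowest attribution score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    base, extra = divmod(len(order), k)
    parts: List[List[int]] = []
    start = 0
    for j in range(k):
        size = base + (1 if j < extra else 0)
        parts.append(order[start:start + size])
        start += size
    return parts


def adapt_per_partition(
    data: Sequence[Any],
    parts: List[List[int]],
    train_adapter: Callable[[List[Any]], Any],
) -> List[Any]:
    """Train one adaptation per subset.

    train_adapter is a hypothetical callback standing in for whatever
    fine-tuning routine is used; it receives the samples of one subset.
    """
    return [train_adapter([data[i] for i in part]) for part in parts]


# Toy usage: six samples, three subsets, and a stand-in "trainer"
# that simply records how many samples each adaptation saw.
scores = [0.9, -0.2, 0.4, 0.1, -0.7, 0.3]
parts = partition_by_score(scores, k=3)
adapters = adapt_per_partition(list("abcdef"), parts, train_adapter=len)
```

At inference time, each trained adaptation would produce one candidate response, so the k adaptations together yield k responses without relying on sampling temperature for diversity.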
Taxonomy
Topics: Karst Systems and Hydrogeology
