SoftSRV: Learn to Generate Targeted Synthetic Data
Giulia DeSalvo, Jean-François Kagy, Lazaros Karydas, Afshin Rostamizadeh, Sanjiv Kumar

TL;DR
SoftSRV is a new framework that uses data-driven loss minimization to generate targeted synthetic data for fine-tuning language models, outperforming prompt engineering methods across multiple domains.
Contribution
Introduces SoftSRV, a novel method for generating targeted synthetic data that improves task-specific model performance without domain-specific tuning.
Findings
SoftSRV outperforms prompt engineering in data quality and model performance.
Data generated by SoftSRV matches the target distribution more closely than the baselines do, as measured by the MAUVE similarity metric.
SoftSRV is effective across coding, math, and reasoning domains.
Abstract
We present a novel framework, SoftSRV, that is used to generate targeted synthetic fine-tuning data for improving task-specific model performance. Given a sample from a target distribution, our proposed framework uses a data-driven loss minimization approach to steer a frozen large language model (LLM) to generate synthetic sequences that are similar to those from the target distribution. SoftSRV provides a practical improvement over common prompt engineering approaches that rely on human-engineered prompt-templates, which can be idiosyncratic, labor-intensive to craft, and may need to be specialized per domain. We empirically evaluate our method against standard baselines guiding a large LLM to generate synthetic data to fine-tune a smaller language model on three different domains (coding, math, reasoning). We perform these evaluations without any particular specialization of the…
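The idea of steering a frozen model by loss minimization can be illustrated with a toy sketch. This is not the authors' implementation: the "LLM" here is a tiny frozen next-token model with random weights, the steering signal is a single trainable vector (a hypothetical stand-in for a learned soft prompt), and the loss is assumed to be next-token negative log-likelihood on the target sample. All names (`E`, `U`, `prompt`, `loss_and_grad`, `sample`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 6, 8

# Frozen toy "language model" (stand-in for an LLM); these weights are never updated.
E = rng.normal(scale=0.5, size=(VOCAB, DIM))  # token embeddings
U = rng.normal(scale=0.5, size=(VOCAB, DIM))  # output projection

def next_token_probs(prompt, prev):
    """Next-token distribution given the trainable prompt vector and the previous token."""
    h = np.tanh(prompt + E[prev])
    logits = U @ h
    z = np.exp(logits - logits.max())
    return h, z / z.sum()

def loss_and_grad(prompt, pairs):
    """Mean next-token negative log-likelihood and its gradient w.r.t. the prompt only."""
    loss, grad = 0.0, np.zeros_like(prompt)
    for prev, nxt in pairs:
        h, p = next_token_probs(prompt, prev)
        loss -= np.log(p[nxt])
        dlogits = p.copy()
        dlogits[nxt] -= 1.0                      # softmax cross-entropy gradient
        grad += (U.T @ dlogits) * (1.0 - h**2)   # backprop through tanh
    return loss / len(pairs), grad / len(pairs)

# Sample from the (toy) target distribution: sequences that cycle 0 -> 1 -> 2 -> 0.
target = [[0, 1, 2, 0, 1, 2], [1, 2, 0, 1, 2, 0]]
pairs = [(s[i], s[i + 1]) for s in target for i in range(len(s) - 1)]

# Train only the prompt vector by gradient descent; the model stays frozen.
prompt = np.zeros(DIM)
initial_loss, _ = loss_and_grad(prompt, pairs)
for _ in range(300):
    _, g = loss_and_grad(prompt, pairs)
    prompt -= 0.1 * g
final_loss, _ = loss_and_grad(prompt, pairs)

def sample(prompt, start, length, rng):
    """Draw a synthetic sequence from the frozen model steered by the learned prompt."""
    seq = [start]
    for _ in range(length):
        _, p = next_token_probs(prompt, seq[-1])
        seq.append(int(rng.choice(VOCAB, p=p)))
    return seq

synthetic = sample(prompt, 0, 5, rng)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
print("synthetic sequence:", synthetic)
```

In this sketch only `prompt` receives gradient updates, which mirrors the abstract's description of steering a frozen model rather than fine-tuning it. Sequences drawn with `sample` would then play the role of synthetic fine-tuning data for a smaller model.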
Taxonomy
Topics: Plant Virus Research Studies · Animal Virus Infections Studies
