Beyond Sample-Level Feedback: Using Reference-Level Feedback to Guide Data Synthesis

Shuhaib Mehri; Xiusi Chen; Heng Ji; Dilek Hakkani-Tür

arXiv:2502.04511·cs.CL·October 14, 2025

TL;DR

This paper introduces Reference-Level Feedback, a novel method for guiding synthetic data generation for instruction tuning, resulting in higher quality datasets and improved model performance.

Contribution

The paper proposes Reference-Level Feedback to enhance synthetic data quality by leveraging reference samples, surpassing traditional sample-level feedback methods.

Findings

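01. Synthesized REFED, a dataset of 10K instruction-response pairs.

02. Llama-3.1-8B-Instruct and Mistral-7B-Instruct fine-tuned on REFED achieved state-of-the-art performance among similarly sized models, reaching a 43.96% length-controlled win rate on AlpacaEval 2.0.

03. Reference-Level Feedback outperforms traditional sample-level feedback methods and generalizes across models.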

Abstract

High-quality instruction-tuning data is crucial for developing Large Language Models (LLMs) that can effectively navigate real-world tasks and follow human instructions. While synthetic data generation offers a scalable approach for creating such datasets, it imposes a quality ceiling where models trained on the data cannot outperform the LLM generating it. To overcome this limitation, we introduce Reference-Level Feedback, a paradigm that extracts desirable characteristics from carefully curated reference samples to guide the synthesis of higher-quality instruction-response pairs. Using this approach, we synthesize REFED, a dataset of 10K instruction-response pairs. Fine-tuning Llama-3.1-8B-Instruct and Mistral-7B-Instruct on REFED yields state-of-the-art performance among similarly sized models, notably reaching a 43.96% length-controlled win rate on AlpacaEval 2.0. Extensive…

Code & Models

Repositories

Shuhaibm/refed (Official)

Videos

Beyond Sample-Level Feedback: Using Reference-Level Feedback to Guide Data Synthesis

Taxonomy

Topics: Experimental Learning in Engineering · Numerical Methods and Algorithms · Intelligent Tutoring Systems and Adaptive Learning