SynQuE: Estimating Synthetic Dataset Quality Without Annotations

Arthur Chen; Victor Zhong

arXiv:2511.03928·cs.LG·May 4, 2026

TL;DR

SynQuE is a framework for estimating how well synthetic datasets will serve real-world tasks when only limited, unannotated real data is available. It ranks candidate datasets with proxy metrics, including one based on large language model reasoning, to improve data selection under scarcity.

Contribution

The paper formalizes the SynQuE problem, introduces the first proxy metrics for it, including the LLM-based LENS, and shows that these proxies rank synthetic datasets effectively across diverse tasks.

Findings

01 LENS outperforms other proxies on complex tasks.

02 Synthetic data selection improves task accuracy significantly.

03 Proxies correlate well with real task performance.

Abstract

We introduce and formalize the Synthetic Dataset Quality Estimation (SynQuE) problem: ranking synthetic datasets by their expected real-world task performance using only limited unannotated real data. This addresses a critical and open challenge where data is scarce due to collection costs or privacy constraints. We establish the first comprehensive benchmarks for this problem by introducing and evaluating proxy metrics that choose synthetic data for training to maximize task performance on real data. We introduce the first proxy metrics for SynQuE by adapting distribution and diversity-based distance measures to our context via embedding models. To address the shortcomings of these metrics on complex planning tasks, we propose LENS, a novel proxy that leverages large language model reasoning. Our results show that SynQuE proxies correlate with real task performance across diverse…
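The ranking setup described in the abstract can be illustrated with a minimal sketch. It assumes embeddings have already been computed for the unannotated real data and for each candidate synthetic dataset, and it uses the distance between mean embeddings as a stand-in proxy. That distance, the function names, and the toy data are all assumptions made for illustration; they are not the paper's metrics or code.

```python
import numpy as np

def mean_embedding_distance(synthetic_emb, real_emb):
    """Euclidean distance between the mean embeddings of two sets (lower = closer).

    Hypothetical stand-in proxy, not one of the paper's metrics.
    """
    return float(np.linalg.norm(synthetic_emb.mean(axis=0) - real_emb.mean(axis=0)))

def rank_synthetic_datasets(candidates, real_emb):
    """Return candidate names sorted from closest to farthest from the real data."""
    scores = {
        name: mean_embedding_distance(emb, real_emb)
        for name, emb in candidates.items()
    }
    return sorted(scores, key=scores.get)

# Toy stand-in embeddings; in practice these would come from an embedding model.
rng = np.random.default_rng(0)
real = rng.normal(loc=0.0, size=(200, 8))
candidates = {
    "near": rng.normal(loc=0.1, size=(200, 8)),  # distribution close to the real data
    "far": rng.normal(loc=2.0, size=(200, 8)),   # distribution shifted away from it
}

ranking = rank_synthetic_datasets(candidates, real)
print(ranking)
```

No labels are used anywhere in the sketch: the proxy sees only embeddings of unannotated real examples, which is the constraint the SynQuE problem imposes.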
