Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection

Anjir Ahmed Chowdhury; Syed Zawad; Feng Yan

arXiv:2605.14062·cs.AI·May 15, 2026

TL;DR

This paper introduces MSIFR, a lightweight, training-free framework that reduces token waste in LLM synthetic data generation by rejecting low-quality outputs early, using multi-stage, rule-based checks.

Contribution

MSIFR is a multi-stage in-flight rejection method that cuts token consumption during synthetic data generation by 11% to 77% without requiring retraining.

Findings

1. Reduces token consumption by 11% to 77% across models and benchmarks.

2. Combining MSIFR with early-exit methods achieves up to 78.2% token savings.

3. Maintains or improves evaluation accuracy despite reduced token usage.

Abstract

While synthetic data generation with large language models (LLMs) is widely used in post-training pipelines, existing approaches typically generate full outputs before applying quality filters, leading to substantial token waste on samples that are ultimately discarded. To address this, we propose Multi-Stage In-Flight Rejection (MSIFR), a lightweight, training-free framework that detects and terminates low-quality generation trajectories at intermediate checkpoints before they reach full completion. MSIFR decomposes the generation process into sequential stages and applies fast rule-based validators to identify arithmetic inconsistencies, hallucination patterns, and formatting violations, enabling early rejection of faulty samples. We formalize in-flight rejection as a sequential decision process and show that any non-trivial discard policy reduces expected token consumption, with…
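The staged rejection idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`check_arithmetic`, `check_repetition`, `generate_with_rejection`), the whitespace-based token count, and the choice of one checkpoint per generated chunk are all assumptions made for illustration. The two validators are simple stand-ins for the rule-based checks the abstract mentions, and the arithmetic check only handles integer expressions.

```python
import re
from collections import Counter
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical validator type: takes the text generated so far and returns
# a rejection reason, or None if nothing looks wrong.
Validator = Callable[[str], Optional[str]]

# Matches integer statements of the form "a op b = c" (assumed format).
ARITH = re.compile(r"(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=\s*(-?\d+)")
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}


def check_arithmetic(text: str) -> Optional[str]:
    """Flag an 'a op b = c' statement whose stated result is wrong."""
    for a, op, b, c in ARITH.findall(text):
        if OPS[op](int(a), int(b)) != int(c):
            return f"arithmetic: {a} {op} {b} != {c}"
    return None


def check_repetition(text: str, max_repeats: int = 2) -> Optional[str]:
    """Flag a non-empty line that appears more than max_repeats times."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    for line, count in Counter(lines).items():
        if count > max_repeats:
            return f"repetition: {line!r}"
    return None


def generate_with_rejection(
    stages: Iterable[str], validators: List[Validator]
) -> Tuple[Optional[str], int, Optional[str]]:
    """Consume chunks lazily and stop at the first failed checkpoint.

    Returns (text or None if rejected, tokens spent, rejection reason).
    Tokens are approximated by whitespace splitting (an assumption).
    """
    text, tokens = "", 0
    for chunk in stages:
        text += chunk
        tokens += len(chunk.split())
        for validate in validators:
            reason = validate(text)
            if reason is not None:
                return None, tokens, reason  # later stages are never generated
    return text, tokens, None


def fake_model(chunks: List[str], produced: List[str]):
    """Stand-in for an LLM that emits one stage at a time."""
    for chunk in chunks:
        produced.append(chunk)  # record what was actually generated
        yield chunk


bad_sample = ["Step 1: 2 + 3 = 6\n", "Step 2: 6 * 2 = 12\n", "Answer: 12\n"]
produced: List[str] = []
result = generate_with_rejection(
    fake_model(bad_sample, produced), [check_arithmetic, check_repetition]
)
print(result, "stages generated:", len(produced))
```

Because the stand-in model is a lazy generator, a sample that fails at the first checkpoint never pays for the remaining stages. A filter applied only after full generation would have paid for all three stages before discarding the sample.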
