Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection
Anjir Ahmed Chowdhury, Syed Zawad, Feng Yan

TL;DR
This paper introduces MSIFR, a lightweight, training-free framework that reduces token waste in LLM synthetic data generation by rejecting low-quality outputs early, using multi-stage rule-based checks.
Contribution
MSIFR is a multi-stage in-flight rejection method that cuts token consumption during synthetic data generation by terminating low-quality generations at intermediate checkpoints, without requiring retraining.
Findings
Reduces token consumption by 11%-77% across models and benchmarks.
Combining MSIFR with early-exit methods achieves up to 78.2% token savings.
Maintains or improves evaluation accuracy despite reduced token usage.
Abstract
While synthetic data generation with large language models (LLMs) is widely used in post-training pipelines, existing approaches typically generate full outputs before applying quality filters, leading to substantial token waste on samples that are ultimately discarded. To address this, we propose Multi-Stage In-Flight Rejection (MSIFR), a lightweight, training-free framework that detects and terminates low-quality generation trajectories at intermediate checkpoints before they reach full completion. MSIFR decomposes the generation process into sequential stages and applies fast rule-based validators to identify arithmetic inconsistencies, hallucination patterns, and formatting violations, enabling early rejection of faulty samples. We formalize in-flight rejection as a sequential decision process and show that any non-trivial discard policy reduces expected token consumption, with…
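The staged check-and-terminate loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `arithmetic_ok`, `generate_with_rejection`, and `Result` are hypothetical, stages are modeled as pre-split text chunks, tokens are approximated by whitespace splitting, and the single arithmetic validator is a toy stand-in for the paper's rule-based validators, whose actual rules are not given in this excerpt.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

# A validator takes the text generated so far and returns True if it is acceptable.
Validator = Callable[[str], bool]

# Matches simple integer statements such as "2 + 3 = 5" (hypothetical rule).
_ARITH = re.compile(r"(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=\s*(-?\d+)")
_OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}


def arithmetic_ok(text: str) -> bool:
    """Return False if any 'a op b = c' statement in the text is wrong."""
    for a, op, b, c in _ARITH.findall(text):
        if _OPS[op](int(a), int(b)) != int(c):
            return False
    return True


@dataclass
class Result:
    accepted: bool
    tokens_used: int                  # whitespace-token approximation
    rejected_at_stage: Optional[int]  # None when the sample is accepted
    text: str


def generate_with_rejection(stages: Iterable[str], validators: List[Validator]) -> Result:
    """Consume stages one at a time; stop as soon as any validator fails."""
    text, tokens = "", 0
    for i, chunk in enumerate(stages):
        text += chunk
        tokens += len(chunk.split())
        if not all(check(text) for check in validators):
            # Early rejection: later stages are never requested from the iterator.
            return Result(False, tokens, i, text)
    return Result(True, tokens, None, text)


good = ["Step 1: 2 + 3 = 5. ", "Step 2: 5 * 4 = 20. ", "Answer: 20"]
bad = ["Step 1: 2 + 3 = 6. ", "Step 2: 6 * 4 = 24. ", "Answer: 24"]

r_good = generate_with_rejection(iter(good), [arithmetic_ok])
r_bad = generate_with_rejection(iter(bad), [arithmetic_ok])
print(r_good.accepted, r_good.tokens_used)                          # True 16
print(r_bad.accepted, r_bad.rejected_at_stage, r_bad.tokens_used)   # False 0 7
```

In this toy run the faulty sample is discarded after its first stage, so only 7 of its 16 approximate tokens are spent. With a real model, `stages` would be a lazy generator that calls the LLM per stage, so a rejected trajectory never pays for its remaining stages.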
