Input-Time Scaling: Adding Noise and Irrelevance into Less-Is-More Drastically Improves Reasoning Performance and Efficiency

Rapheal Huang (Yuming); Weilong Guo

arXiv:2508.13654·cs.LG·April 22, 2026

TL;DR

This paper introduces Input-Time Scaling, a method that adds noisy and irrelevant context to model inputs during both training and inference. The authors report that it improves the reasoning performance and efficiency of large language models without extensive data curation.

Contribution

The paper systematically relaxes the data-quality constraints of Less-Is-More training by adding controlled noise, shows that mixing relevant and irrelevant contexts in both training and inference improves reasoning performance and efficiency, and introduces Input-Time Scaling as a practical method.

Findings

01. Adding noisy and irrelevant contexts improves reasoning efficiency.

02. Mixing relevant and irrelevant data yields optimal results.

03. Input-Time Scaling achieves state-of-the-art performance on reasoning benchmarks.

Abstract

Large Language Models (LLMs) excel at reasoning, traditionally requiring high-quality large-scale data and extensive training. Recent works reveal a very appealing Less-Is-More phenomenon where very small, carefully curated high-quality datasets match resource-intensive approaches. In this work, we further systematically relax their quality constraints by adding controlled noise via persona context relevance and comparing datasets of different qualities. Counterintuitively, we find that mixing relevant and irrelevant contexts consistently across training and inference stages yields optimal results -- a phenomenon we term training-testing co-design. Dataset quality comparisons show that high-quality data benefits weaker models on easy questions, while low-quality data achieves higher scores on hard questions with capable models. Across our experiments, reasoning performance is linked to…
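
The abstract describes adding controlled noise through the relevance of a persona context, and applying the same mixing strategy at training and inference time. A minimal sketch of that idea follows. The persona sentences, the strategy names, the prompt layout, and the helpers `add_persona_context` and `build_split` are all hypothetical illustrations and are not taken from the paper.

```python
import random

# Hypothetical persona pools; the paper's actual personas are not shown on this page.
RELEVANT_PERSONAS = [
    "You are a mathematician who specializes in competition problems.",
    "You are a tutor who explains algebra step by step.",
]
IRRELEVANT_PERSONAS = [
    "You are a pastry chef who specializes in sourdough bread.",
    "You are a tour guide who knows every street in Lisbon.",
]


def add_persona_context(question, strategy, rng):
    """Prepend a persona sentence to `question` according to `strategy`.

    Assumed strategies: "none", "relevant", "irrelevant", "mixed".
    """
    if strategy == "none":
        return question
    if strategy == "relevant":
        pool = RELEVANT_PERSONAS
    elif strategy == "irrelevant":
        pool = IRRELEVANT_PERSONAS
    elif strategy == "mixed":
        # Draw from both pools so relevant and irrelevant contexts are mixed.
        pool = RELEVANT_PERSONAS + IRRELEVANT_PERSONAS
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return f"{rng.choice(pool)}\n\n{question}"


def build_split(questions, strategy, seed=0):
    """Apply one strategy to every question, reproducibly for a given seed."""
    rng = random.Random(seed)
    return [add_persona_context(q, strategy, rng) for q in questions]


if __name__ == "__main__":
    questions = ["What is 2 + 2?", "Solve x^2 = 9."]
    # Use the same strategy for the training set and for test-time prompts.
    train_inputs = build_split(questions, "mixed", seed=0)
    test_inputs = build_split(questions, "mixed", seed=1)
    for prompt in train_inputs + test_inputs:
        print(prompt, end="\n---\n")
```

Using one function for both splits keeps the training and inference inputs on the same strategy, which is one way to read the "training-testing co-design" the abstract mentions.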
