From Noise to Diversity: Random Embedding Injection in LLM Reasoning
Heejun Kim, Seungpil Lee, Jewon Yeom, Jaewon Sok, Seonghyeon Park, Jeongjae Park, Taesup Kim, Sundong Kim

TL;DR
This paper introduces Random Soft Prompts (RSPs): untrained, randomly sampled embedding vectors appended to the input that improve reasoning accuracy and diversity in large language models, both at inference time and when applied during training.
Contribution
RSPs provide a training-free, simple method to enhance reasoning and diversity in LLMs, isolating the structural effect of prompt injection without learned content.
Findings
RSPs reach accuracy comparable to optimized (trained) soft prompts on math reasoning benchmarks in several settings.
RSPs increase early-stage token diversity and widen Pass@N during inference.
Applying RSPs during training yields practical performance gains.
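For readers unfamiliar with the metric, Pass@N is the probability that at least one of N sampled completions solves the problem. The sketch below uses the widely used unbiased estimator computed from n samples of which c are correct. Whether the paper computes Pass@N this way is an assumption, and the function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k draws (without
    replacement) from n samples is correct, given c correct samples.

    Assumption: this is the common unbiased estimator; the summary
    does not state which Pass@N definition the paper uses.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k incorrect samples: every size-k subset has a hit.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # 10 samples, 5 correct: a single draw succeeds half the time.
    print(pass_at_k(n=10, c=5, k=1))
```

Under this definition, a method that spreads probability over more distinct reasoning paths can raise Pass@N at larger N even when single-sample accuracy barely changes.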
Abstract
Recent soft prompt research has tried to improve reasoning by inserting trained vectors into LLM inputs, yet whether the gain comes from the learned content or from the act of injection itself has not been carefully separated. We study Random Soft Prompts (RSPs), which drop the training step entirely and append a freshly drawn sequence of random embedding vectors to the input. Each RSP vector is sampled from an isotropic Gaussian fitted to the entrywise mean and variance of the pretrained embedding table; the sequence carries no learned content, and yet reaches accuracy comparable to optimized soft prompts on math reasoning benchmarks in several settings. The mechanism unfolds in two stages: because attention has to absorb a never-seen-before random position, the distribution over the first few generated tokens flattens and reasoning trajectories branch, and as generation continues this…
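The sampling step described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code. It assumes "entrywise mean and variance" means a single scalar mean and a single scalar variance pooled over every entry of the embedding table, and that the random vectors are concatenated after the input token embeddings. The names `sample_rsp` and `append_rsp`, the toy table, and the prompt length of 4 are all hypothetical.

```python
import numpy as np


def sample_rsp(embedding_table: np.ndarray, num_vectors: int,
               rng: np.random.Generator) -> np.ndarray:
    """Draw num_vectors random embeddings from an isotropic Gaussian.

    Assumption: the Gaussian is fitted with one scalar mean and one
    scalar standard deviation pooled over all entries of the table.
    """
    mu = embedding_table.mean()
    sigma = embedding_table.std()
    dim = embedding_table.shape[1]
    return rng.normal(loc=mu, scale=sigma, size=(num_vectors, dim))


def append_rsp(input_embeds: np.ndarray, rsp: np.ndarray) -> np.ndarray:
    """Concatenate the random vectors after the input embeddings.

    Assumption: "append" means placing them at the end of the sequence.
    """
    return np.concatenate([input_embeds, rsp], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in for a pretrained embedding table (vocab=1000, dim=16).
    table = rng.normal(0.0, 0.02, size=(1000, 16))
    token_ids = np.array([1, 2, 3])
    inputs = table[token_ids]
    # A fresh sequence is drawn on every call; nothing is trained.
    augmented = append_rsp(inputs, sample_rsp(table, 4, rng))
    print(augmented.shape)
```

Matching the pooled statistics of the embedding table keeps the random vectors on the same scale as real token embeddings, which is presumably why the abstract fits the Gaussian to the table rather than using a unit normal.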
