Enhancing reasoning accuracy in large language models during inference time
Vinay Sharma, Manish Jain

TL;DR
This paper investigates inference-time techniques such as self-consistency, dual-model agreement, and self-reflection for improving the reasoning accuracy of large language models, and finds that stochastic decoding significantly improves performance with minimal overhead.
Contribution
The study provides a systematic comparison of three inference-time strategies for improving LLM reasoning accuracy using Chain-of-Thought prompting.
Findings
Self-consistency with nucleus sampling improves accuracy by 9-15%.
Dual-model agreement enhances reliability in moderate-risk scenarios.
Self-reflection offers limited gains for smaller models.
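Nucleus (top-p) sampling, mentioned in the first finding, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' decoding code: the function names `top_p_filter` and `sample_token`, the toy token distribution, and the choice of `top_p = 0.7` are all hypothetical, and temperature scaling is left out for brevity.

```python
import random


def top_p_filter(probs, top_p):
    """Keep the smallest set of highest-probability tokens whose
    cumulative probability reaches top_p, then renormalize."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


def sample_token(probs, top_p, rng):
    """Draw one token from the renormalized nucleus."""
    filtered = top_p_filter(probs, top_p)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy next-token distribution (hypothetical values).
toy_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
nucleus = top_p_filter(toy_probs, top_p=0.7)
token = sample_token(toy_probs, top_p=0.7, rng=random.Random(0))
```

With these toy values only the two most probable tokens survive the filter, so low-probability continuations are never sampled while the output still varies between draws.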
Abstract
Large Language Models (LLMs) often exhibit strong linguistic abilities while remaining unreliable on multi-step reasoning tasks, particularly when deployed without additional training or fine-tuning. In this work, we study inference-time techniques to improve the reasoning accuracy of LLMs. We systematically evaluate three classes of inference-time strategies: (i) self-consistency via stochastic decoding, where the model is sampled multiple times using controlled temperature and nucleus sampling and the most frequent final answer is selected; (ii) dual-model reasoning agreement, where outputs from two independent models are compared and only consistent reasoning traces are trusted; and (iii) self-reflection, where the model critiques and revises its own reasoning. Across all evaluated methods, we employ Chain-of-Thought (CoT) [1] prompting to elicit explicit intermediate reasoning steps…
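The selection rules behind strategies (i) and (ii) can be sketched in a few lines. This is an illustrative sketch under assumptions, not the paper's implementation: `sample_fn` is a hypothetical stand-in for one stochastic model call that returns a final answer string, answers are normalized by trimming and lowercasing, ties go to the answer seen first, and the dual-model check compares final answers only, which is a simplification of comparing full reasoning traces.

```python
from collections import Counter


def majority_answer(answers):
    """Return (answer, vote_share) for the most frequent final answer."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Normalize so that "42" and "42 " count as the same answer (assumption).
    counts = Counter(a.strip().lower() for a in answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)


def self_consistency(sample_fn, prompt, n_samples=5):
    """Sample the model n_samples times and majority-vote the answers.

    sample_fn is a hypothetical callable: prompt -> final answer string.
    """
    answers = [sample_fn(prompt) for _ in range(n_samples)]
    return majority_answer(answers)


def dual_model_agreement(answer_a, answer_b):
    """Return the shared answer if both models agree, else None (abstain)."""
    a = answer_a.strip().lower()
    b = answer_b.strip().lower()
    return a if a == b else None
```

The vote share returned by `majority_answer` can serve as a rough confidence signal, and a `None` from `dual_model_agreement` marks a case where the caller might fall back to another strategy or decline to answer.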
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
