Enhancing reasoning accuracy in large language models during inference time

Vinay Sharma; Manish Jain

arXiv:2603.21301·cs.CL·March 24, 2026

Open Access

TL;DR

This paper investigates three inference-time techniques (self-consistency, dual-model agreement, and self-reflection) for improving the reasoning accuracy of large language models, and finds that self-consistency via stochastic decoding significantly improves performance with minimal overhead.

Contribution

The study provides a systematic comparison of three inference-time strategies for improving LLM reasoning accuracy using Chain-of-Thought prompting.

Findings

01 Self-consistency with nucleus sampling improves accuracy by 9-15%.

02 Dual-model agreement enhances reliability in moderate-risk scenarios.

03 Self-reflection offers limited gains for smaller models.

Abstract

Large Language Models (LLMs) often exhibit strong linguistic abilities while remaining unreliable on multi-step reasoning tasks, particularly when deployed without additional training or fine-tuning. In this work, we study inference-time techniques to improve the reasoning accuracy of LLMs. We systematically evaluate three classes of inference-time strategies: (i) self-consistency via stochastic decoding, where the model is sampled multiple times using controlled temperature and nucleus sampling and the most frequent final answer is selected; (ii) dual-model reasoning agreement, where outputs from two independent models are compared and only consistent reasoning traces are trusted; and (iii) self-reflection, where the model critiques and revises its own reasoning. Across all evaluated methods, we employ Chain-of-Thought (CoT) [1] prompting to elicit explicit intermediate reasoning steps…
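
The first two strategies in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the stub sampler, and the choice to compare only final answers (rather than full reasoning traces) are assumptions made for illustration. In practice `sample_answer` would wrap a model call that uses Chain-of-Thought prompting with temperature and nucleus sampling, then extracts the final answer.

```python
from collections import Counter
from typing import Callable, Iterable, Optional, Sequence


def majority_vote(answers: Sequence[str]) -> Optional[str]:
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


def self_consistency(sample_answer: Callable[[], str], n_samples: int = 5) -> Optional[str]:
    """Draw n_samples final answers and keep the most frequent one.

    sample_answer is a hypothetical callable standing in for one
    stochastic model call followed by final-answer extraction.
    """
    answers = [sample_answer() for _ in range(n_samples)]
    return majority_vote(answers)


def dual_model_agreement(answer_a: str, answer_b: str) -> Optional[str]:
    """Trust an answer only when both models agree; otherwise abstain (None).

    Assumption: agreement is checked on whitespace-stripped final answers,
    a simplification of comparing reasoning traces.
    """
    a, b = answer_a.strip(), answer_b.strip()
    return a if a == b else None


def make_stub_sampler(canned: Iterable[str]) -> Callable[[], str]:
    """Build a deterministic stand-in for a model: replays canned answers."""
    it = iter(canned)
    return lambda: next(it)


if __name__ == "__main__":
    sampler = make_stub_sampler(["7", "8", "7", "7", "9"])
    print(self_consistency(sampler, n_samples=5))
    print(dual_model_agreement("12", " 12"))
    print(dual_model_agreement("12", "13"))
```

Returning `None` on disagreement models abstention: the caller can then fall back to another strategy or flag the question for review instead of trusting either model alone.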

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications