# PiCSAR: Probabilistic Confidence Selection And Ranking for Reasoning Chains

**Authors:** Joshua Ong Jun Leang, Zheng Zhao, Aryo Pradipta Gema, Sohee Yang, Wai-Chung Kwan, Xuanli He, Wenda Li, Pasquale Minervini, Eleonora Giunchiglia, Shay B. Cohen

arXiv: 2508.21787 · 2026-05-01

## TL;DR

PiCSAR is a training-free method that improves reasoning accuracy by scoring candidate solutions using joint log-likelihood, effectively identifying correct reasoning chains without ground-truth answers.

## Contribution

The paper introduces PiCSAR, a training-free scoring method for best-of-n sampling that ranks each candidate solution by the joint log-likelihood of its reasoning chain and final answer.
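As a worked equation, the joint log-likelihood factorizes by the chain rule of probability into two terms, which is one way to read the split into reasoning confidence and answer confidence described in the abstract. The symbols are assumptions introduced here for illustration: $x$ is the input question, $r$ the sampled reasoning chain, and $y$ the final answer. The exact notation and any normalization used in the paper may differ.

$$
\log p(r, y \mid x) \;=\; \underbrace{\log p(r \mid x)}_{\text{reasoning confidence}} \;+\; \underbrace{\log p(y \mid r, x)}_{\text{answer confidence}}
$$

Under this reading, best-of-n selection picks the candidate $(r_i, y_i)$ with the largest value of the left-hand side.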

## Key findings

- PiCSAR improves results by +10.18 on MATH500 and +9.81 on AIME2025.
- It outperforms baselines with at least 2x fewer samples in 16 out of 20 comparisons.
- Correct reasoning chains exhibit significantly higher reasoning and answer confidence, which supports the choice of scoring function.

## Abstract

Best-of-n sampling improves the accuracy of large language models (LLMs) and large reasoning models (LRMs) by generating multiple candidate solutions and selecting the one with the highest reward. The key challenge for reasoning tasks is designing a scoring function that can identify correct reasoning chains without access to ground-truth answers. We propose Probabilistic Confidence Selection And Ranking (PiCSAR): a simple, training-free method that scores each candidate generation using the joint log-likelihood of the reasoning and final answer. The joint log-likelihood of the reasoning and final answer naturally decomposes into reasoning confidence and answer confidence. PiCSAR achieves substantial gains across diverse benchmarks (+10.18 on MATH500, +9.81 on AIME2025), outperforming baselines with at least 2x fewer samples in 16 out of 20 comparisons. Our analysis reveals that correct reasoning chains exhibit significantly higher reasoning and answer confidence, justifying the effectiveness of PiCSAR.
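The selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes per-token log-probabilities for the reasoning and answer spans are already available from the sampling model, and it sums them without length normalization. The `Candidate` class and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Candidate:
    """One sampled solution (hypothetical container, not from the paper)."""
    answer: str
    reasoning_logprobs: Sequence[float]  # per-token log p of the reasoning chain
    answer_logprobs: Sequence[float]     # per-token log p of the final answer


def reasoning_confidence(c: Candidate) -> float:
    # Sum of token log-probs over the reasoning span.
    return sum(c.reasoning_logprobs)


def answer_confidence(c: Candidate) -> float:
    # Sum of token log-probs over the answer span, conditioned on the reasoning.
    return sum(c.answer_logprobs)


def joint_score(c: Candidate) -> float:
    # Joint log-likelihood = reasoning confidence + answer confidence.
    return reasoning_confidence(c) + answer_confidence(c)


def select_best(candidates: Sequence[Candidate]) -> Candidate:
    # Best-of-n: return the candidate with the highest joint score.
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=joint_score)


# Toy log-probabilities, made up for illustration only.
candidates = [
    Candidate("41", [-1.0, -2.0], [-0.5]),
    Candidate("42", [-0.5, -0.25], [-0.25]),
]
best = select_best(candidates)
print(best.answer, joint_score(best))
```

No ground-truth answer or trained reward model appears anywhere in the selection, which is what makes the approach training-free. Because raw sums penalize longer chains, a practical variant might normalize by token count; whether and how the paper does so is not stated in this summary.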

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21787/full.md

## Figures

78 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21787/full.md

## References

60 references — full list in the complete paper: https://tomesphere.com/paper/2508.21787/full.md

---
Source: https://tomesphere.com/paper/2508.21787