Latent Self-Consistency for Reliable Majority-Set Selection in Short- and Long-Answer Reasoning

Jungsuk Oh; Jay-Yoon Lee

arXiv:2508.18395·cs.CL·March 2, 2026

TL;DR

The paper introduces Latent Self-Consistency (LSC), a novel method that improves the reliability of answer selection in large language models across short- and long-form questions by using learnable token embeddings with minimal computational overhead.

Contribution

LSC is a new, lightweight approach that enhances consistency-based answer selection in LLMs without altering model architecture or significantly increasing inference time.

Findings

01

LSC outperforms existing methods like SC, USC, and WUCS on multiple benchmarks.

02

LSC maintains low calibration error across answer formats.

03

LSC adds at most 0.9% runtime overhead on top of standard decoding during inference.

Abstract

Probabilistic decoding in Large Language Models (LLMs) often yields inconsistent outputs, particularly on complex or long-form questions. Self-Consistency (SC) mitigates this for short-form QA by majority voting over exact strings, whereas Universal Self-Consistency (USC) and Weighted Unigram Consistency Score (WUCS) extend to long-form responses but lose accuracy on short-form benchmarks. We introduce Latent Self-Consistency (LSC), which selects the most semantically consistent response using learnable token embeddings. LSC's lightweight forward processing of summary tokens introduces only negligible runtime overhead (at most 0.9%) on top of standard decoding of the base LLM and requires no changes to the model architecture. Across 6 short-form and 5 long-form reasoning benchmarks (e.g., MATH, MMLU, TruthfulQA), LSC surpasses SC, USC, and WUCS on both short-form and…
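
To make the two selection styles mentioned in the abstract concrete, here is a minimal sketch. The first function illustrates SC-style majority voting over exact answer strings. The second picks the response whose embedding is, on average, most similar to the others, which is one simple way to read "most semantically consistent". It is not the authors' implementation: the function names, the use of mean cosine similarity, and the toy vectors are all assumptions, and the learnable summary-token embeddings that LSC relies on are replaced here by plain lists of floats.

```python
import math
from collections import Counter


def majority_vote(answers):
    """SC-style selection: return the most frequent exact string and its vote share."""
    counts = Counter(a.strip() for a in answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_consistent(embeddings):
    """Hypothetical helper: index of the response with the highest mean
    cosine similarity to all other responses, plus that mean score.

    `embeddings` stands in for per-response vectors; how LSC actually
    obtains and compares them is not specified in this summary.
    """
    n = len(embeddings)
    scores = []
    for i in range(n):
        sims = [cosine(embeddings[i], embeddings[j]) for j in range(n) if j != i]
        scores.append(sum(sims) / len(sims) if sims else 0.0)
    best = max(range(n), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    # Short-form case: exact-string voting works when answers are canonical.
    print(majority_vote(["42", "42 ", "41"]))

    # Long-form case: toy 2-D vectors standing in for response embeddings.
    toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    print(most_consistent(toy))
```

Exact-string voting breaks down when two long answers say the same thing in different words, which is why the second function compares vectors instead of strings. The mean similarity score is only an illustrative confidence proxy.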
