Optimal Self-Consistency for Efficient Reasoning with Large Language Models
Austin Feng, Marius Alonso, Ambroise Odonnat

TL;DR
This paper analyzes the scaling behavior of self-consistency in large language models, introduces a dynamic sampling method called Blend-ASC, and reports a 6.8x average reduction in sample usage on reasoning tasks.
Contribution
It provides the first theoretical analysis of self-consistency scaling, introduces Blend-ASC for dynamic sample allocation, and achieves state-of-the-art efficiency in LLM reasoning.
Findings
Power law scaling for self-consistency across datasets
Blend-ASC reduces sample usage by 6.8x on average
Blend-ASC outperforms fixed- and dynamic-allocation baselines
Abstract
Self-consistency (SC) is a widely used test-time inference technique for improving performance in chain-of-thought reasoning. It involves generating multiple responses, or samples, from a large language model (LLM) and selecting the most frequent answer. This procedure can naturally be viewed as a majority vote or empirical mode estimation. Despite its effectiveness, SC is prohibitively expensive at scale when naively applied to datasets, and it lacks a unified theoretical treatment of sample efficiency and scaling behavior. In this paper, we provide the first comprehensive analysis of SC's scaling behavior and its variants, drawing on mode estimation and voting theory. We derive and empirically validate power law scaling for self-consistency across datasets, and analyze the sample efficiency of fixed-allocation and dynamic-allocation sampling schemes. From these insights, we introduce…
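As a rough illustration of the majority-vote view of SC described in the abstract, the following minimal Python sketch draws a fixed number of answers and returns the empirical mode. It covers only plain fixed-allocation SC, not Blend-ASC, whose details are not given here. The names `majority_vote`, `self_consistency`, and `sample_answer` are hypothetical, and `sample_answer` stands in for an LLM call that returns one final answer per sampled reasoning chain.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def majority_vote(answers: Sequence[Hashable]) -> Hashable:
    """Return the most frequent answer (the empirical mode).

    Ties are broken in favor of the answer that appeared first.
    """
    if not answers:
        raise ValueError("need at least one answer to vote on")
    return Counter(answers).most_common(1)[0][0]


def self_consistency(sample_answer: Callable[[], Hashable], num_samples: int) -> Hashable:
    """Fixed-allocation SC: draw num_samples answers, then take a majority vote.

    sample_answer is a hypothetical stand-in for one LLM sample; it is an
    assumption of this sketch, not an API from the paper.
    """
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")
    answers = [sample_answer() for _ in range(num_samples)]
    return majority_vote(answers)


# Toy usage: a canned sequence of answers replaces real model samples.
canned = iter(["18", "20", "18", "18", "17"])
print(self_consistency(lambda: next(canned), num_samples=5))  # prints 18
```

Each call to `sample_answer` would be a full model generation in practice, so `num_samples` drives the inference cost directly. That cost is what the fixed-allocation and dynamic-allocation schemes mentioned in the abstract trade off against accuracy.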
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
