Thinking by Subtraction: Confidence-Driven Contrastive Decoding for LLM Reasoning

Lexiang Tang; Weihao Gao; Bingchen Zhao; Lu Ma; Qiao Jin; Bang Yang; Yuexian Zou

arXiv:2602.18232·cs.CL·February 23, 2026

Open Access

TL;DR

This paper introduces Confidence-Driven Contrastive Decoding, a training-free test-time method that improves large language model reasoning accuracy by intervening only at low-confidence tokens, reducing both errors and output length.

Contribution

It proposes a training-free, confidence-driven contrastive decoding approach that targets low-confidence tokens for improved reasoning in large language models.

Findings

01 Significantly improves accuracy on mathematical reasoning benchmarks.

02 Reduces output length and computational overhead.

03 Enhances reasoning reliability without additional training.

Abstract

Recent work on test-time scaling for large language model (LLM) reasoning typically assumes that allocating more inference-time computation uniformly improves correctness. However, prior studies show that reasoning uncertainty is highly localized: a small subset of low-confidence tokens disproportionately contributes to reasoning errors and unnecessary output expansion. Motivated by this observation, we propose Thinking by Subtraction, a confidence-driven contrastive decoding approach that improves reasoning reliability through targeted token-level intervention. Our method, Confidence-Driven Contrastive Decoding, detects low-confidence tokens during decoding and intervenes selectively at these positions. It constructs a contrastive reference by replacing high-confidence tokens with minimal placeholders, and refines predictions by subtracting this reference distribution at low-confidence…
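The abstract is cut off before the details, so the following is only a minimal sketch of the general idea it describes (detect a low-confidence position, mask confident context tokens to form a reference, subtract the reference distribution), not the authors' implementation. Everything concrete here is an assumption for illustration: the confidence measure (maximum next-token probability), the `threshold` and `alpha` values, the `"_"` placeholder, the scoring rule `(1 + alpha) * log p - alpha * log q`, and the function names themselves.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def mask_confident_tokens(tokens, confidences, threshold, placeholder="_"):
    """Build a reference context by replacing high-confidence tokens with a
    placeholder (hypothetical helper; the placeholder choice is an assumption)."""
    return [placeholder if c >= threshold else t
            for t, c in zip(tokens, confidences)]


def contrastive_step(main_logits, ref_logits, threshold=0.5, alpha=0.5):
    """Pick the next token id; intervene only when the main model is unsure.

    main_logits: next-token logits given the full context.
    ref_logits:  next-token logits given the masked reference context.
    Returns (token_id, intervened).
    """
    main_logp = log_softmax(main_logits)
    # Assumed confidence measure: probability of the top token.
    confidence = math.exp(max(main_logp))
    if confidence >= threshold:
        # High confidence: ordinary greedy decoding, no intervention.
        best = max(range(len(main_logp)), key=main_logp.__getitem__)
        return best, False
    # Low confidence: subtract the reference distribution in log space.
    ref_logp = log_softmax(ref_logits)
    scores = [(1 + alpha) * p - alpha * q for p, q in zip(main_logp, ref_logp)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, True


# Toy usage with a three-token vocabulary (made-up logits).
masked = mask_confident_tokens(["2", "+", "2", "=", "5"],
                               [0.9, 0.95, 0.9, 0.8, 0.3], threshold=0.5)
unsure = contrastive_step([2.0, 1.9, 0.0], [3.0, 0.0, 0.0])
sure = contrastive_step([5.0, 0.0, 0.0], [0.0, 5.0, 0.0])
```

In the toy call, the main model is nearly tied between tokens 0 and 1, while the reference context strongly favors token 0; subtracting the reference shifts the choice to token 1. When the main model is confident, the reference is ignored entirely, which is what keeps the intervention targeted. A practical version would likely also restrict candidates to tokens the main model finds plausible, since plain subtraction can promote very unlikely tokens.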

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning