Fine-grained Approaches for Confidence Calibration of LLMs in Automated Code Revision

Hong Yi Lin; Chunhua Liu; Haoyu Gao; Patanamon Thongtanunam; Christoph Treude

arXiv:2604.06723·cs.SE·April 9, 2026

TL;DR

This paper introduces fine-grained confidence calibration methods for LLMs in automated code revision, improving the reliability of confidence scores for better decision-making.

Contribution

It proposes local Platt-scaling applied to fine-grained confidence scores, enhancing calibration accuracy over traditional global methods in code editing tasks.
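As a rough illustration of the idea, Platt-scaling fits a two-parameter sigmoid, p = sigmoid(a * s + b), that maps a raw confidence score s to a calibrated probability of correctness. The sketch below is a minimal pure-Python version, not the paper's implementation. The function names, the gradient-descent fitting, and the reading of "local" as one scaler per group of scores (versus one "global" scaler for all scores) are assumptions made for illustration; the paper's own definition of local calibration may differ.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.5, steps=5000):
    """Fit p = sigmoid(a * s + b) by gradient descent on the mean log loss.

    scores: raw confidence scores; labels: 1 if the output was correct, else 0.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = _sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def apply_platt(params, score):
    """Map a raw score to a calibrated probability using fitted (a, b)."""
    a, b = params
    return _sigmoid(a * score + b)


def fit_local_platt(scores, labels, groups):
    """Assumed 'local' variant: fit one Platt scaler per group label."""
    by_group = {}
    for s, y, g in zip(scores, labels, groups):
        ss, ys = by_group.setdefault(g, ([], []))
        ss.append(s)
        ys.append(y)
    return {g: fit_platt(ss, ys) for g, (ss, ys) in by_group.items()}


# Hypothetical held-out data: raw scores and whether each revision was correct.
raw_scores = [0.2, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8]
correct = [0, 0, 0, 1, 1, 1, 1, 0]
global_params = fit_platt(raw_scores, correct)
calibrated = [apply_platt(global_params, s) for s in raw_scores]
```

In practice the scaler would be fitted on a held-out calibration set and applied to scores from unseen outputs; fitting and evaluating on the same data, as in the toy example, overstates calibration quality.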

Findings

01

Fine-grained scores achieve lower calibration error.

02

Calibration improves across multiple tasks and models.

03

Combining local and global calibration yields the best results.
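The findings above are stated in terms of calibration error. One widely used way to measure it is the expected calibration error (ECE), which bins predictions by confidence and averages the gap between mean confidence and observed accuracy in each bin. Whether the paper reports this exact metric is an assumption; the sketch below, with a hypothetical function name and equal-width bins, only shows how such a number is typically computed.

```python
def expected_calibration_error(confidences, labels, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |avg confidence - accuracy| per bin.

    confidences: predicted probabilities of correctness in [0, 1].
    labels: 1 if the output was actually correct, else 0.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, labels):
        # Confidence 1.0 is placed in the last bin rather than out of range.
        idx = min(int(c * n_bins), n_bins - 1)
        bins[idx].append((c, y))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# Hypothetical example: four outputs scored 0.9, of which three were correct.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

A lower value means confidence scores track observed correctness more closely; a perfectly calibrated set of scores gives an ECE of zero.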

Abstract

In today's AI-assisted software engineering landscape, developers increasingly depend on LLMs that are highly capable yet inherently imperfect. The tendency of these models to produce incorrect outputs can reduce developer productivity. A canonical mitigation is to provide calibrated confidence scores that faithfully reflect each output's likelihood of correctness at the instance level. Such information allows users to make immediate decisions about whether to accept an output, abstain from error-prone outputs, and better align their expectations with the model's capabilities. Since post-trained LLMs do not inherently produce well-calibrated confidence scores, researchers have developed post-hoc calibration methods, with global Platt-scaling of sequence-level confidence scores proving effective in many generative software engineering tasks but remaining unreliable or unexplored for…
