Towards Objective Fine-tuning: How LLMs' Prior Knowledge Causes Potential Poor Calibration?

Ziming Wang; Zeyu Shi; Haoyi Zhou; Shiqi Gao; Qingyun Sun; Jianxin Li

arXiv:2505.20903 · cs.CL · May 28, 2025

TL;DR

This paper investigates how LLMs' prior knowledge affects calibration during fine-tuning. It finds that known data, meaning fine-tuning examples that align with what the model already knows, induces overconfidence, and it proposes CogCalib to improve calibration without sacrificing task performance.

Contribution

The paper introduces CogCalib, a cognition-aware fine-tuning framework that significantly enhances LLM calibration by addressing prior knowledge effects.

Findings

01

CogCalib reduces Expected Calibration Error (ECE) by 57% on average compared to standard fine-tuning.

02

Fine-tuning on data aligned with the model's prior knowledge induces overconfidence and harms calibration, whereas new knowledge improves calibration.

03

CogCalib maintains task performance while improving calibration across multiple tasks.
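
The ECE figure in the first finding refers to a metric that is commonly computed by grouping predictions into confidence bins and averaging the gap between each bin's accuracy and its mean confidence, weighted by bin size. The sketch below illustrates that common equal-width-bin definition only; the function name, the `n_bins=10` default, and the bin-edge convention are assumptions for illustration and are not taken from the paper, which may compute ECE differently.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width binned ECE (illustrative sketch, not the paper's code).

    confidences: predicted probabilities in [0, 1] for the chosen answers.
    correct:     booleans, True where the chosen answer was right.
    n_bins:      number of equal-width confidence bins (assumed default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Assign each prediction to a bin [k/n_bins, (k+1)/n_bins);
    # a confidence of exactly 1.0 is placed in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Hypothetical example: four answers at 90% confidence, three of them correct.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, True, False]
)
```

In this toy case the model claims 90% confidence but is right 75% of the time, so the metric reports the overconfidence gap; a lower value indicates that stated confidence tracks actual accuracy more closely.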

Abstract

Fine-tuned Large Language Models (LLMs) often demonstrate poor calibration, with their confidence scores misaligned with actual performance. While calibration has been extensively studied in models trained from scratch, the impact of LLMs' prior knowledge on calibration during fine-tuning remains understudied. Our research reveals that LLMs' prior knowledge can lead to poor calibration because known data, which is ubiquitous in real-world fine-tuning, appears harmful for calibration. Specifically, data aligned with LLMs' prior knowledge induces overconfidence, while new knowledge improves calibration. Our findings expose a tension: LLMs' encyclopedic knowledge, while enabling task versatility, undermines calibration through unavoidable knowledge overlaps. To address this, we propose CogCalib, a cognition-aware framework that applies targeted learning strategies…
