LitCab: Lightweight Language Model Calibration over Short- and Long-form Responses
Xin Liu, Muhammad Khalifa, Lu Wang

TL;DR
LitCab is a lightweight, parameter-efficient calibration method for language models that significantly improves their probability estimates across diverse text generation tasks, enhancing trustworthiness without extensive fine-tuning.
Contribution
The paper introduces LitCab, a calibration method built on a single linear layer, which outperforms existing techniques at calibrating large language models while adding very few parameters.
Findings
Larger models are better calibrated on tasks with short responses, but not necessarily on tasks with longer ones.
GPT models are better calibrated than LLaMA and Vicuna, despite fewer parameters.
Fine-tuning on limited data can worsen calibration, which underscores the importance of the fine-tuning setup.
Abstract
A model is considered well-calibrated when its probability estimate aligns with the actual likelihood of the output being correct. Calibrating language models (LMs) is crucial, as it plays a vital role in detecting and mitigating hallucinations of LMs as well as building more trustworthy models. However, standard calibration techniques may not be suited for LM calibration. For instance, post-processing methods such as temperature scaling do not reorder the candidate generations. On the other hand, training-based methods require fine-tuning the entire model, which is impractical for LMs of large scale. We present LitCab, a lightweight calibration mechanism consisting of a single linear layer that takes the input text representation and predicts a bias term, which is then added to the LM output logits. LitCab improves model calibration by only adding < 2% of the original model parameters.…
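The contrast the abstract draws between temperature scaling and a learned bias term can be illustrated with a small sketch. It is a toy illustration, not the authors' implementation: the weights are hand-picked rather than trained, the function names (`linear_bias`, `calibrated_probs`, `temperature_probs`, `ranking`) are hypothetical, and it ranks three candidate tokens at a single step, whereas the abstract speaks of candidate generations.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear_bias(hidden, weight, offset):
    """Single linear layer: maps a hidden vector to one bias value per candidate."""
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(weight, offset)]

def calibrated_probs(logits, hidden, weight, offset):
    """Add the predicted bias to the LM logits, then normalize."""
    bias = linear_bias(hidden, weight, offset)
    return softmax([l + b for l, b in zip(logits, bias)])

def temperature_probs(logits, temperature):
    """Temperature scaling: divide every logit by the same positive scalar."""
    return softmax([l / temperature for l in logits])

def ranking(probs):
    """Candidate indices sorted from most to least probable."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

# Toy values (assumptions for illustration only).
logits = [2.0, 1.0, 0.5]            # original LM logits for 3 candidates
hidden = [1.0, -1.0]                # stand-in for the input text representation
weight = [[-1.0, 0.0], [0.5, -0.5], [0.0, 0.0]]  # hand-picked, not trained
offset = [0.0, 0.0, 0.0]

base_rank = ranking(softmax(logits))
temp_rank = ranking(temperature_probs(logits, 2.0))
bias_rank = ranking(calibrated_probs(logits, hidden, weight, offset))
```

In this toy setting, `temp_rank` equals `base_rank` because dividing all logits by one positive scalar preserves their order, while `bias_rank` can differ because the bias is input-dependent and varies per candidate.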
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Byte Pair Encoding · Dense Connections · Cosine Annealing · Residual Connection · Weight Decay · Softmax · Adam
