LitCab: Lightweight Language Model Calibration over Short- and Long-form Responses

Xin Liu; Muhammad Khalifa; Lu Wang

arXiv:2310.19208 · cs.CL · March 14, 2024 · 1 citation

TL;DR

LitCab is a lightweight, parameter-efficient calibration method for language models that significantly improves their probability estimates across diverse text generation tasks, enhancing trustworthiness without extensive fine-tuning.

Contribution

The paper introduces LitCab, a calibration method built on a single linear layer. It outperforms existing techniques at calibrating large language models while adding fewer than 2% of the original model parameters.

Findings

01

Larger models are better calibrated on short-form tasks, but not necessarily on long-form ones.

02

GPT models are better calibrated than LLaMA and Vicuna models, despite having fewer parameters.

03

Limited fine-tuning can worsen calibration, emphasizing the importance of training setup.
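
The findings above compare how well calibrated different models are. This page does not say which metric backs those comparisons, so the sketch below assumes one common choice, expected calibration error (ECE): predictions are grouped into confidence bins, and the gap between average confidence and accuracy in each bin is averaged, weighted by bin size. The function name, the bin count, and the toy data are illustrative assumptions, not taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between confidence and accuracy per bin.

    confidences: probabilities in [0, 1] that the model assigned to its answers.
    correct: booleans saying whether each answer was actually right.
    n_bins: number of equal-width confidence bins (an assumed default).
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical toy data: the model claims 90% confidence four times
# but is right only three times, so it is overconfident.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, True, False]
)

# A model that is fully confident when right and assigns zero when wrong.
well_calibrated = expected_calibration_error([1.0, 0.0], [True, False])
```

A lower value means the stated probabilities track actual correctness more closely; a value of zero means every bin's confidence matches its accuracy exactly.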

Abstract

A model is considered well-calibrated when its probability estimate aligns with the actual likelihood of the output being correct. Calibrating language models (LMs) is crucial, as it plays a vital role in detecting and mitigating hallucinations of LMs as well as building more trustworthy models. However, standard calibration techniques may not be suited for LM calibration. For instance, post-processing methods such as temperature scaling do not reorder the candidate generations. On the other hand, training-based methods require fine-tuning the entire model, which is impractical for LMs of large scale. We present LitCab, a lightweight calibration mechanism consisting of a single linear layer that takes the input text representation and predicts a bias term, which is then added to the LM output logits. LitCab improves model calibration by only adding < 2% of the original model parameters.…
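
The mechanism described in the abstract can be sketched for a single softmax step. This is a toy illustration, not the authors' implementation: the helper names, the two-dimensional "text representation", and every numeric value are hypothetical. It shows that dividing logits by a positive temperature leaves the ranking of candidates unchanged, whereas adding a bias predicted by a linear layer from the input representation can change it.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear_bias(hidden, weight, offset):
    """Single linear layer: maps a representation to one bias per candidate."""
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(weight, offset)
    ]


def ranking(scores):
    """Candidate indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical values: three candidates, two-dimensional representation.
logits = [2.0, 1.5, 0.1]
hidden = [1.0, -1.0]
weight = [[-0.5, 0.5], [0.5, -0.5], [0.0, 0.0]]
offset = [0.0, 0.0, 0.0]

# Temperature scaling: one positive scalar applied to all logits.
temperature = 2.0
scaled = [x / temperature for x in logits]

# Bias-based adjustment: the bias depends on the input representation.
bias = linear_bias(hidden, weight, offset)
adjusted = [x + b for x, b in zip(logits, bias)]

original_order = ranking(softmax(logits))
scaled_order = ranking(softmax(scaled))
adjusted_order = ranking(softmax(adjusted))
```

Here `scaled_order` equals `original_order`, while `adjusted_order` swaps the first two candidates. In this toy setup the only added parameters are `weight` and `offset`, which is what keeps a layer of this kind small relative to the model it adjusts.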

Code & Models

Repositories

launchnlp/litcab (PyTorch, official)

Videos

LitCab: Lightweight Language Model Calibration over Short- and Long-form Responses · SlidesLive

Taxonomy

Topics

Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods

Multi-Head Attention · Attention Is All You Need · Byte Pair Encoding · Dense Connections · Cosine Annealing · Residual Connection · Weight Decay · Softmax · Adam