SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales
Tianyang Xu, Shujin Wu, Shizhe Diao, Xiaoze Liu, Xingyao Wang, Yangyi Chen, Jing Gao

TL;DR
SaySelf is a training framework that improves large language models' ability to express accurate, fine-grained confidence levels and generate self-reflective rationales, enhancing their reliability and interpretability.
Contribution
It combines supervised fine-tuning with reinforcement learning to train LLMs to produce calibrated confidence estimates and self-reflective rationales, surpassing previous methods.
Findings
Reduces confidence calibration error in LLMs
Maintains high task performance on diverse datasets
Generates reasonable self-reflective rationales
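To make "calibration error" concrete, here is a minimal sketch of expected calibration error (ECE), a common way to measure the gap between stated confidence and actual accuracy. The summary does not say which metric the paper reports, so the choice of ECE, the function name, the 0-1 confidence scale, and the ten equal-width bins are all assumptions made for illustration.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Illustrative ECE: weighted mean of |avg confidence - accuracy| per bin.

    Assumes confidences lie in [0, 1]; this is not the paper's own code.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if len(confidences) == 0:
        raise ValueError("need at least one prediction")

    # Group predictions into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes in proportion to how many predictions it holds.
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

A model that says 0.9 on four answers and gets three right is overconfident by 0.15 on that bin; a lower ECE means stated confidence tracks accuracy more closely.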
Abstract
Large language models (LLMs) often generate inaccurate or fabricated information and generally fail to indicate their confidence, which limits their broader applications. Previous work elicits confidence from LLMs by direct or self-consistency prompting, or constructing specific datasets for supervised finetuning. The prompting-based approaches have inferior performance, and the training-based approaches are limited to binary or inaccurate group-level confidence estimates. In this work, we present the advanced SaySelf, a training framework that teaches LLMs to express more accurate fine-grained confidence estimates. In addition, beyond the confidence scores, SaySelf initiates the process of directing LLMs to produce self-reflective rationales that clearly identify gaps in their parametric knowledge and explain their uncertainty. This is achieved by using an LLM to automatically…
Taxonomy
Topics: Artificial Intelligence in Law · Legal Education and Practice Innovations
