The Effects of Data Augmentation on Confidence Estimation for LLMs
Rui Wang, Renyu Zhu, Minmin Lin, Runze Wu, Tangjie Lv, Changjie Fan, Haobo Wang

TL;DR
This paper investigates how various data augmentation techniques influence confidence estimation in large language models, demonstrating that diverse augmentations improve reliability and reduce overconfidence, with random combinations being particularly effective.
Contribution
It provides a comprehensive analysis of data augmentation effects on confidence estimation in LLMs, highlighting the importance of diversity and transferability of augmentation strategies.
Findings
Data augmentation improves confidence estimation performance.
Greater data diversity enhances augmentation effectiveness.
Random combination of augmentations is a promising approach.
Abstract
Confidence estimation is crucial for reflecting the reliability of large language models (LLMs), particularly for widely used closed-source models. Utilizing data augmentation for confidence estimation is viable, but existing discussions focus on specific augmentation techniques, which limits its potential. We study the impact of different data augmentation methods on confidence estimation. Our findings indicate that data augmentation strategies can achieve better performance and mitigate the impact of overconfidence. We investigate the factors that influence this and discover that, while preserving semantic information, greater data diversity enhances the effectiveness of augmentation. Furthermore, the impact of different augmentation strategies varies across different ranges of application. Considering parameter transferability and usability, the random combination of augmentations is a…
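To make the idea in the abstract concrete, here is a minimal sketch of one way augmentation-based confidence estimation could work for a black-box model: perturb the question with a random combination of augmentations, query the model on each variant, and treat the agreement rate of the majority answer as the confidence. This is an illustration under stated assumptions, not necessarily the authors' procedure. The three toy augmentations, the `ask_model` callable, and the parameters `n_samples` and `k` are all hypothetical names introduced here.

```python
import random
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical augmentation signature: (text, rng) -> perturbed text.
Augment = Callable[[str, random.Random], str]


def drop_word(text: str, rng: random.Random) -> str:
    """Delete one randomly chosen word (toy augmentation)."""
    words = text.split()
    if len(words) < 2:
        return text
    del words[rng.randrange(len(words))]
    return " ".join(words)


def swap_adjacent(text: str, rng: random.Random) -> str:
    """Swap one random pair of neighbouring words (toy augmentation)."""
    words = text.split()
    if len(words) < 2:
        return text
    i = rng.randrange(len(words) - 1)
    words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)


def duplicate_word(text: str, rng: random.Random) -> str:
    """Repeat one randomly chosen word (toy augmentation)."""
    words = text.split()
    if not words:
        return text
    i = rng.randrange(len(words))
    words.insert(i, words[i])
    return " ".join(words)


def random_combination(text: str, augmenters: List[Augment],
                       k: int, rng: random.Random) -> str:
    """Apply k distinct augmentations, chosen at random, in sequence."""
    for augment in rng.sample(augmenters, min(k, len(augmenters))):
        text = augment(text, rng)
    return text


def estimate_confidence(question: str,
                        ask_model: Callable[[str], str],
                        augmenters: List[Augment],
                        n_samples: int = 10,
                        k: int = 2,
                        seed: int = 0) -> Tuple[str, float]:
    """Return (majority answer, fraction of prompts that agree with it).

    `ask_model` is an assumed black-box wrapper around an LLM call; the
    agreement fraction is used here as a simple confidence proxy.
    """
    rng = random.Random(seed)
    prompts = [question] + [
        random_combination(question, augmenters, k, rng)
        for _ in range(n_samples)
    ]
    answers = [ask_model(p) for p in prompts]
    top_answer, count = Counter(answers).most_common(1)[0]
    return top_answer, count / len(answers)
```

In practice `ask_model` would wrap a call to whichever closed-source model is being evaluated, and the toy word-level perturbations would be replaced by augmentations that preserve the question's meaning, since the abstract ties effectiveness to diversity under preserved semantics.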
Taxonomy
Topics: Distributed and Parallel Computing Systems · Simulation Techniques and Applications · Advanced Data Storage Technologies
