Can LLMs Take Retrieved Information with a Grain of Salt?
Behzad Shayegh, Mohamed Osama Ahmed, Fred Tung, Leo Feng

TL;DR
This paper evaluates how well large language models adjust their responses to the certainty of retrieved information, revealing systematic limitations and proposing an interaction strategy to improve reliability.
Contribution
It introduces a new evaluation metric, provides empirical insights into LLMs' uncertainty handling, and proposes a portable interaction strategy to enhance context-certainty obedience.
Findings
LLMs struggle to recall prior knowledge after observing an uncertain context.
They often misinterpret expressed certainties and overtrust complex contexts.
The proposed interaction strategy reduces obedience errors by 25% on average, without modifying model weights.
Abstract
Large language models have demonstrated impressive retrieval-augmented capabilities. However, a crucial area remains underexplored: their ability to appropriately adapt responses to the certainty of the retrieved information. It is a limitation with real consequences in high-stakes domains like medicine and finance. We evaluate eight LLMs on their context-certainty obedience, measuring how well they adjust responses to match expressed context certainty. Our analysis reveals systematic limitations: LLMs struggle to recall prior knowledge after observing an uncertain context, misinterpret expressed certainties, and overtrust complex contexts. To address these, we propose an interaction strategy combining prior reminders, certainty recalibration, and context simplification. This approach reduces obedience errors by 25% on average, without modifying model weights, demonstrating the efficacy…
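The abstract names three ingredients of the interaction strategy (prior reminders, certainty recalibration, context simplification) but the text available here does not give the prompts or procedures. The sketch below is only an illustration of how such a prompt might be assembled. The function name build_prompt, its parameters, the prompt wording, the first-sentence simplification rule, and the recalibration hook are all hypothetical and are not taken from the paper.

```python
def build_prompt(question, context, stated_certainty,
                 prior_answer=None, recalibrate=None):
    """Assemble a prompt from three illustrative (hypothetical) components."""
    parts = []

    if prior_answer is not None:
        # Prior reminder: restate what the model answered with no context.
        parts.append(f"Your answer before seeing any context: {prior_answer}")

    # Context simplification (placeholder rule): keep only the first sentence.
    simplified = context.split(". ")[0].rstrip(".") + "."
    parts.append(f"Retrieved context: {simplified}")

    # Certainty recalibration (placeholder): pass the stated certainty through
    # a caller-supplied mapping; use it unchanged if none is given.
    certainty = recalibrate(stated_certainty) if recalibrate else stated_certainty
    parts.append(f"The context is correct with probability {certainty:.2f}.")

    parts.append(f"Question: {question}")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        question="Who wrote X?",
        context="Alice wrote X. It was published in 1990.",
        stated_certainty=0.6,
        prior_answer="Bob",
    ))
```

Because the strategy works purely at the prompt level, a sketch like this needs no access to model weights, which is consistent with the abstract's claim that the approach is applied without modifying them.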
