To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts
Yukun Huang, Sanxing Chen, Hongyi Cai, Bhuwan Dhingra

TL;DR
This paper investigates how large language models can better calibrate their trust in external contexts, especially when faced with inaccurate information, by proposing and evaluating two confidence reasoning approaches.
Contribution
The paper introduces the concept of situated faithfulness and proposes two methods, Self-Guided Confidence Reasoning (SCR) and Rule-Based Confidence Reasoning (RCR), to improve LLMs' ability to assess the reliability of external information.
Findings
SCR outperforms RCR on strong reasoning models like GPT-4.
RCR outperforms SCR on smaller models like Llama-3-8B.
Fine-tuning SCR with Confidence Reasoning Direct Preference Optimization (CR-DPO) improves performance across datasets.
Abstract
Large Language Models (LLMs) are often augmented with external contexts, such as those used in retrieval-augmented generation (RAG). However, these contexts can be inaccurate or intentionally misleading, leading to conflicts with the model's internal knowledge. We argue that robust LLMs should demonstrate situated faithfulness, dynamically calibrating their trust in external information based on their confidence in the internal knowledge and the external context to resolve knowledge conflicts. To benchmark this capability, we evaluate LLMs across several QA datasets, including a newly created dataset featuring in-the-wild incorrect contexts sourced from Reddit posts. We show that when provided with both correct and incorrect contexts, both open-source and proprietary models tend to overly rely on external information, regardless of its factual accuracy. To enhance situated faithfulness,…
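
As a rough illustration of what calibrating trust between internal knowledge and an external context could look like, the sketch below picks between two candidate answers by comparing confidence scores. It is not the paper's SCR or RCR procedure. The `Candidate` type, the `choose_answer` function, the `margin` parameter, and the assumption that confidence scores in [0, 1] are already available are all hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """An answer plus a confidence score in [0, 1] (hypothetical, assumed given)."""
    answer: str
    confidence: float


def choose_answer(internal: Candidate, contextual: Candidate, margin: float = 0.0) -> str:
    """Pick between the model's own answer and the context-based answer.

    Illustrative rule only: keep the internal answer when its confidence
    exceeds the contextual answer's confidence by more than `margin`;
    otherwise defer to the external context.
    """
    # No knowledge conflict: both sources agree, so there is nothing to resolve.
    if internal.answer == contextual.answer:
        return internal.answer
    # Conflict: trust internal knowledge only when it is clearly more confident.
    if internal.confidence > contextual.confidence + margin:
        return internal.answer
    return contextual.answer


if __name__ == "__main__":
    internal = Candidate(answer="Canberra", confidence=0.9)
    contextual = Candidate(answer="Sydney", confidence=0.4)
    print(choose_answer(internal, contextual))
```

A larger `margin` biases this rule toward the external context. How the confidence scores themselves would be obtained is left open here.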
Taxonomy
Topics: Topic Modeling
