To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts

Yukun Huang; Sanxing Chen; Hongyi Cai; Bhuwan Dhingra

arXiv:2410.14675 · cs.CL · March 18, 2025


TL;DR

This paper investigates how large language models can better calibrate their trust in external contexts, especially when faced with inaccurate information, by proposing and evaluating two confidence reasoning approaches.

Contribution

It introduces the concept of situated faithfulness and proposes two methods, Self-Guided Confidence Reasoning (SCR) and Rule-Based Confidence Reasoning (RCR), that help LLMs assess the reliability of external information.

Findings

01 SCR outperforms RCR on models with strong reasoning capabilities, such as GPT-4o.

02 RCR outperforms SCR on smaller models such as Llama-3-8B.

03 Fine-tuning SCR with CR-DPO improves performance across datasets.

Abstract

Large Language Models (LLMs) are often augmented with external contexts, such as those used in retrieval-augmented generation (RAG). However, these contexts can be inaccurate or intentionally misleading, leading to conflicts with the model's internal knowledge. We argue that robust LLMs should demonstrate situated faithfulness, dynamically calibrating their trust in external information based on their confidence in the internal knowledge and the external context to resolve knowledge conflicts. To benchmark this capability, we evaluate LLMs across several QA datasets, including a newly created dataset featuring in-the-wild incorrect contexts sourced from Reddit posts. We show that when provided with both correct and incorrect contexts, both open-source and proprietary models tend to overly rely on external information, regardless of its factual accuracy. To enhance situated faithfulness,…
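The calibration idea in the abstract, trusting whichever source the model is more confident in, can be illustrated with a toy decision rule. This is only a sketch under stated assumptions, not the paper's SCR or RCR implementation: the `Candidate` type, the `choose_answer` function, and the probability-like confidence scores are all hypothetical names introduced here for illustration, and how such confidences would be estimated is left open.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """An answer paired with a confidence score (hypothetical structure)."""
    answer: str
    confidence: float  # assumed to be a probability-like score in [0, 1]


def choose_answer(internal: Candidate, contextual: Candidate) -> str:
    """Return the answer with the higher confidence.

    `internal` is the model's closed-book answer; `contextual` is the answer
    derived from the external context. Ties go to the contextual answer,
    which is an arbitrary choice made for this sketch.
    """
    if internal.confidence > contextual.confidence:
        return internal.answer
    return contextual.answer


# The model is confident in its own knowledge and the context looks unreliable.
print(choose_answer(Candidate("Paris", 0.9), Candidate("Lyon", 0.4)))
# The model is unsure, so it defers to the external context.
print(choose_answer(Candidate("1987", 0.2), Candidate("1989", 0.8)))
```

A model that always returned the contextual answer would match the over-reliance behavior the abstract reports; the comparison step is what makes the trust conditional on confidence.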


Code & Models

Repositories

kkkevinkkkkk/situated_faithfulness (official)

Models

kkkevinkkk/Llama-3-8B-CR-DPO
model · 1 download


Taxonomy

Topics: Topic Modeling