Task Matters: Knowledge Requirements Shape LLM Responses to Context-Memory Conflict
Kaiser Sun, Fan Bai, Mark Dredze

TL;DR
This paper investigates how large language models handle conflicts between contextual information and their parametric memory, showing that the effect of a conflict depends on the task and that strategies such as rationales shift models toward the context.
Contribution
It introduces a diagnostic framework to analyze LLM responses under controlled context-memory conflicts across different tasks, highlighting task-specific reliance and evaluation biases.
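As a rough illustration of what a controlled context-memory conflict probe could look like, the Python sketch below plants an answer in the context that contradicts the answer a model is assumed to hold in memory, then labels each response by which source it follows. This is not the authors' framework: the `Probe` class, the `classify` and `context_reliance` helpers, and the substring-matching rule are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """One conflict item (hypothetical structure, not from the paper)."""
    question: str
    memory_answer: str   # answer the model is assumed to give without context
    context_answer: str  # conflicting answer planted in the provided context


def build_context(template: str, probe: Probe) -> str:
    """Fill a context template with the conflicting answer."""
    return template.format(answer=probe.context_answer)


def classify(response: str, probe: Probe) -> str:
    """Label a response as following the context, the memory, or neither.

    Uses naive case-insensitive substring matching, which is an assumption
    made for brevity; a real evaluation would need a more careful matcher.
    """
    text = response.lower()
    has_context = probe.context_answer.lower() in text
    has_memory = probe.memory_answer.lower() in text
    if has_context and not has_memory:
        return "context"
    if has_memory and not has_context:
        return "memory"
    return "other"


def context_reliance(responses: list[str], probes: list[Probe]) -> float:
    """Share of decided responses (context or memory) that follow the context."""
    labels = [classify(r, p) for r, p in zip(responses, probes)]
    decided = [label for label in labels if label != "other"]
    if not decided:
        return 0.0
    return decided.count("context") / len(decided)


probe = Probe(
    question="What is the capital of France?",
    memory_answer="Paris",
    context_answer="Lyon",
)
context = build_context("According to the document, the capital is {answer}.", probe)
# Responses would come from a model; these are hand-written stand-ins.
responses = ["The capital is Lyon.", "It is Paris.", "I am not sure."]
reliance = context_reliance(responses, [probe] * 3)
```

Running the same probes across tasks that differ in how much they should depend on the context, while keeping the underlying fact fixed, is one way to compare reliance scores between tasks.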
Findings
Performance degradation under conflict depends on how heavily the task relies on knowledge and on how plausible the conflict is.
Strategies like rationales increase context reliance, aiding context-only tasks but harming knowledge-dependent tasks.
Context-memory conflict introduces evaluation biases, which calls into question LLMs' reliability as judges.
Abstract
Large language models (LLMs) draw on both contextual information and parametric memory, yet these sources can conflict. Prior studies have largely examined this issue in contextual question answering, implicitly assuming that tasks should rely on the provided context, leaving unclear how LLMs behave when tasks require different types and degrees of knowledge utilization. We address this gap with a model-agnostic diagnostic framework that holds underlying knowledge constant while introducing controlled conflicts across tasks with varying knowledge demands. Experiments on representative open-weight and proprietary LLMs show that performance degradation under conflict is driven by both task-specific knowledge reliance and conflict plausibility; that strategies such as rationales or context reiteration increase context reliance, helping context-only tasks but harming those requiring…
