Can Knowledge Editing Really Correct Hallucinations?
Baixiang Huang, Canyu Chen, Xiongxiao Xu, Ali Payani, Kai Shu

TL;DR
This paper introduces HalluEditBench, a comprehensive benchmark for evaluating knowledge editing methods in correcting hallucinations in large language models, highlighting their strengths and limitations across multiple dimensions.
Contribution
The paper constructs a dataset of over 6,000 hallucinations across 9 domains and 26 topics and provides a holistic evaluation framework for knowledge editing methods, addressing a key gap in assessing how well they correct hallucinations.
Findings
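Knowledge editing methods vary in effectiveness across the five evaluated dimensions (Efficacy, Generalization, Portability, Locality, and Robustness), and results also depend on the domain and the model being edited.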
The benchmark reveals limitations in current methods' generalization and robustness.
Insights from the evaluation can guide future improvements in knowledge editing.
Abstract
Large Language Models (LLMs) suffer from hallucinations, referring to the non-factual information in generated content, despite their superior capacities across tasks. Meanwhile, knowledge editing has been developed as a new popular paradigm to correct erroneous factual knowledge encoded in LLMs with the advantage of avoiding retraining from scratch. However, a common issue of existing evaluation datasets for knowledge editing is that they do not ensure that LLMs actually generate hallucinated answers to the evaluation questions before editing. When LLMs are evaluated on such datasets after being edited by different techniques, it is hard to directly adopt the performance to assess the effectiveness of different knowledge editing methods in correcting hallucinations. Thus, the fundamental question remains insufficiently validated: Can knowledge editing really correct hallucinations in…
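The evaluation gap described in the abstract can be illustrated with a minimal sketch: keep only the questions the pre-edit model answers incorrectly, then measure how many of those the edited model gets right. This is not the authors' implementation. The record layout, the function names (`is_correct`, `select_hallucinations`, `efficacy`), and the substring-based answer matching are all assumptions made for illustration, and the "models" are plain lookup tables standing in for real LLM calls.

```python
from typing import Callable, Dict, List

# Hypothetical record layout: {"question": ..., "answer": ...}
Question = Dict[str, str]
# Hypothetical model interface: takes a question, returns an answer string.
Model = Callable[[str], str]


def is_correct(prediction: str, reference: str) -> bool:
    """Crude check (an assumption): the reference appears in the prediction."""
    return reference.strip().lower() in prediction.strip().lower()


def select_hallucinations(model: Model, questions: List[Question]) -> List[Question]:
    """Keep only questions the pre-edit model answers incorrectly."""
    return [q for q in questions if not is_correct(model(q["question"]), q["answer"])]


def efficacy(edited_model: Model, hallucinated: List[Question]) -> float:
    """Fraction of previously hallucinated questions answered correctly after editing."""
    if not hallucinated:
        return 0.0
    fixed = sum(is_correct(edited_model(q["question"]), q["answer"]) for q in hallucinated)
    return fixed / len(hallucinated)


# Toy stand-ins for a model before and after editing (not real LLM output).
questions = [
    {"question": "Capital of France?", "answer": "Paris"},
    {"question": "Capital of Australia?", "answer": "Canberra"},
    {"question": "Capital of Canada?", "answer": "Ottawa"},
]
pre_edit = {
    "Capital of France?": "Paris",
    "Capital of Australia?": "Sydney",
    "Capital of Canada?": "Toronto",
}
# The hypothetical edit fixes only the Australia answer.
post_edit = {**pre_edit, "Capital of Australia?": "Canberra"}

hallucinated = select_hallucinations(pre_edit.get, questions)
print(len(hallucinated))                       # 2
print(efficacy(post_edit.get, hallucinated))   # 0.5
```

Scoring only on the filtered set is the point of the sketch: a question the model already answered correctly before editing says nothing about whether the edit corrected a hallucination.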
Peer Reviews
Decision: ICLR 2025 Poster
The paper offers a foundational contribution to the study of hallucination correction in LLMs, posing a critical question that has been overlooked in the field: Can knowledge editing effectively address LLM hallucinations? The authors’ approach demonstrates originality by establishing five distinct evaluation facets (Efficacy, Generalization, Portability, Locality, and Robustness), each of which extends beyond conventional measures to comprehensively assess knowledge editing impacts. These dime
The paper could benefit from providing a more comprehensive description of the dataset construction process, which currently lacks sufficient detail. There is limited information on how domains and topics were sampled or selected from the knowledge source. The paper does not indicate whether the dataset predominantly features popular entities or long-tail entities (e.g., [Sun et al., 2023](https://arxiv.org/abs/2308.10168)), which may impact the generalizability of findings across different dist
A major strength of the paper is that its results show the effectiveness of knowledge editing techniques can vary greatly and is not necessarily as high on actual LLM hallucinations as experiments on previously existing datasets suggest. The fact that the domain of the fact and the model itself have a decisive impact on efficacy (and other scores) is an important point and should be taken into account. The article convincingly and clearly demonstrates
I missed the authors answering the question from the title of the article more conclusively or at least clearly addressing it again in their conclusion, as this question can basically be answered in the negative (at least in part) based on the results presented. There is also no corresponding outlook. The summary is therefore somewhat abbreviated. Although the structure of the paper is good and clearly laid out, the section on related work seems like an appendix. It could be moved to before the
S1. This paper presents HalluEditBench, a benchmarking framework that evaluates knowledge editing methods against a dataset of over 6,000 hallucinations across 9 domains and 26 topics. This facilitates future research on leveraging knowledge editing techniques to mitigate hallucinations in LLMs. S2. This paper introduces some novel insights, such as: The current assessment of knowledge editing could be unreliable; The manifestation of hallucination depends on question design; Editing methods ma
W1. The paper mentions that existing datasets for knowledge editing fail to verify whether knowledge editing methods can effectively correct hallucinations in large language models. The paper should provide at least one example to illustrate the shortcomings of other benchmarks in this regard. For instance, it could include statistics from other datasets or highlight differences in dataset construction methods compared to the approach proposed here, underscoring the innovations introduced by thi
Taxonomy
Topics: Mental Health and Psychiatry · Hallucinations in medical conditions · Functional Brain Connectivity Studies
