A Novel Approach to Eliminating Hallucinations in Large Language Model-Assisted Causal Discovery
Grace Sng, Yanming Zhang, Klaus Mueller

TL;DR
This paper investigates hallucinations in large language models used for causal discovery, proposing retrieval-augmented generation and multi-LLM debate methods to reduce false information and improve reliability.
Contribution
The paper presents the first survey of hallucinations in popular LLMs used for causal discovery. It proposes RAG to reduce hallucinations when quality data is available, and introduces a novel multi-LLM debate, with an arbiter, that audits edges in causal graphs.
Findings
Hallucinations occur when LLMs are used for causal discovery, so the choice of LLM matters.
RAG reduces hallucinations when quality data is available.
Multi-LLM debate with an arbiter reduces hallucinations to a degree comparable to RAG.
Abstract
The increasing use of large language models (LLMs) in causal discovery as a substitute for human domain experts highlights the need for optimal model selection. This paper presents the first hallucination survey of popular LLMs for causal discovery. We show that hallucinations exist when using LLMs in causal discovery, so the choice of LLM is important. We propose using Retrieval-Augmented Generation (RAG) to reduce hallucinations when quality data is available. Additionally, we introduce a novel method employing multiple LLMs with an arbiter in a debate to audit edges in causal graphs, achieving a reduction in hallucinations comparable to that of RAG.
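The abstract does not spell out the debate protocol, so the following is only a minimal sketch of how an edge audit with several LLMs and an arbiter might be organized. Everything in it is an assumption made for illustration: the names `audit_edge` and `audit_graph`, the prompt wording, the fixed number of rounds, and the KEEP/REMOVE verdict format are hypothetical and are not taken from the paper. A "model" is any callable that maps a prompt string to a reply string, so no real LLM API is assumed.

```python
from typing import Callable, List, Tuple

Edge = Tuple[str, str]          # (cause, effect)
Model = Callable[[str], str]    # hypothetical stand-in for an LLM call


def audit_edge(edge: Edge, debaters: List[Model], arbiter: Model,
               rounds: int = 2) -> bool:
    """Return True if the arbiter keeps the edge after the debate.

    Assumed protocol (not from the paper): debaters speak in turn for a
    fixed number of rounds, each seeing the transcript so far, and the
    arbiter then answers KEEP or REMOVE.
    """
    cause, effect = edge
    question = f"Does {cause} directly cause {effect}?"
    transcript: List[str] = []
    for _ in range(rounds):
        for i, debater in enumerate(debaters):
            prompt = question + "\n" + "\n".join(transcript)
            transcript.append(f"Debater {i}: {debater(prompt)}")
    verdict = arbiter(
        question + "\n" + "\n".join(transcript) + "\nAnswer KEEP or REMOVE."
    )
    return verdict.strip().upper().startswith("KEEP")


def audit_graph(edges: List[Edge], debaters: List[Model], arbiter: Model,
                rounds: int = 2) -> List[Edge]:
    """Keep only the edges that survive the audit."""
    return [e for e in edges if audit_edge(e, debaters, arbiter, rounds)]
```

With stub callables in place of real models, `audit_graph(edges, [model_a, model_b], arbiter)` returns the subset of `edges` that the arbiter votes to keep. Replacing the stubs with actual LLM clients, and the prompts with the authors' own, would be needed before drawing any conclusions from it.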
Taxonomy
Topics: Machine Learning in Healthcare · Anomaly Detection Techniques and Applications · Biomedical Text Mining and Ontologies
Methods: Attention Is All You Need · Attention Dropout · Linear Warmup With Linear Decay · WordPiece · Weight Decay · Byte Pair Encoding · Linear Layer · Softmax · BERT
