A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models

S.M Towhidul Islam Tonmoy; S M Mehedi Zaman; Vinija Jain; Anku Rani; Vipula Rawte; Aman Chadha; Amitava Das

arXiv:2401.01313·cs.CL·January 9, 2024·112 cites

Open Access 1 Repo

TL;DR

This paper surveys over 32 techniques for mitigating hallucinations in large language models, categorizing them into a detailed taxonomy and analyzing their challenges to guide future research.

Contribution

It provides the first comprehensive taxonomy of hallucination mitigation methods in LLMs and analyzes their limitations and challenges.

Findings

01

Retrieval-augmented generation improves factual accuracy.

02

Knowledge retrieval techniques help reduce hallucinations.

03

The taxonomy clarifies the landscape of mitigation strategies.

Abstract

As Large Language Models (LLMs) continue to advance in their ability to write human-like text, a key challenge remains around their tendency to hallucinate, generating content that appears factual but is ungrounded. This issue of hallucination is arguably the biggest hindrance to safely deploying these powerful LLMs into real-world production systems that impact people's lives. The journey toward widespread adoption of LLMs in practical settings heavily relies on addressing and mitigating hallucinations. Unlike traditional AI systems focused on limited tasks, LLMs have been exposed to vast amounts of online text data during training. While this allows them to display impressive language fluency, it also means they are capable of extrapolating information from the biases in training data, misinterpreting ambiguous prompts, or modifying the information to align superficially with the…

Code & Models

Repositories

lastmile-ai/aiconfig/tree/main/cookbooks/Chain-of-Verification

Taxonomy

Topics: COVID-19 diagnosis using AI · Machine Learning in Healthcare · Big Data and Digital Economy

Methods: Tanh Activation · Sigmoid Activation · GloVe Embeddings · Bidirectional LSTM · Long Short-Term Memory · Location-based Attention · Softmax · Sequence to Sequence · Contextual Word Vectors · ALIGN